Decontextualizing + Assumptions = Fallacies: And It's Worse for Big Data
- Myrtle Greer
- 8 years ago
Transcription
1 Decontextualizing + Assumptions = Fallacies: And It's Worse for Big Data
Michael Smithson, Research School of Psychology, The Australian National University
2 Setting the Scene
This talk presents several commonplace myths about public data and problematic assumptions regarding models of such data. These problems are intensified and less corrigible in big data. The myths and assumptions fall under three headings:
1. Big data have integrity
2. Models of big data extract patterns that reflect underlying truths
3. Big data are better than small data
3 Big data have integrity?
Accuracy? Data may be subject to recording distortion, recording errors, and/or measurement confounds. These problems may be worse for big data because big data often are an assemblage of multiple second-hand data sets taken out of context and used for purposes other than those originally intended.
Distortion and error:
- Impact of shadow economies on official economic indicators (e.g., employment rates, inflation)
- Gaming the indicators (e.g., Australian universities and the ERA)
- Making it all up (e.g., the recent Canberra hospital records scandal)
Measurement confounds:
- Differing or shifting criteria (e.g., definitions of crime, suicide in Catholic populations)
- Measurement contamination (e.g., webpage number of visits and dwell-times)
4 Big data have integrity?
Precision? Big data often are sample data rather than population data, and the samples may not be representative of their referent populations. Nevertheless, decision makers and policy analysts usually treat sample data or estimates as though they are population data or estimates.
Stability? Data often are not recorded just once, but are re-recorded as better information becomes available or as errors are discovered. For example, in November 2012 the first official estimate of the U.S. net employment increase was 146,000 new jobs. By the third revision that number had increased by about 69% to 247,000.
Completeness? Data collection schemes often are set up by groups who lack the necessary expertise. For example, the Australian Transport Safety Bureau collects lots of data on civil aviation flights that have resulted in an incident, but collects no data on incident-free flights.
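The size of the employment revision quoted under "Stability?" can be checked directly from the two figures on the slide. The snippet below is only an arithmetic sketch; the function and variable names are illustrative and not part of the talk.

```python
def relative_change(first, revised):
    """Relative change from a first estimate to a revised estimate."""
    return (revised - first) / first


# Figures quoted on the slide (U.S. net employment increase, November 2012).
first_estimate = 146_000
third_revision = 247_000

change = relative_change(first_estimate, third_revision)
print(f"{third_revision - first_estimate:,} more jobs, a {change:.1%} upward revision")
# prints: 101,000 more jobs, a 69.2% upward revision
```

An analyst who had treated the first release as final would have been working with a figure roughly two-fifths below the eventual one.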
5 Models of big data reveal underlying truths?
Big data increasingly require automated data analysis, i.e., data mining. Data mining has no quality control beyond the assumptions built into its algorithms and post-hoc interpretations by humans.
Spurious correlations: There is no guarantee that a pattern (e.g., a correlation between two variables) uncovered by data mining is meaningful or useful. Other unmeasured factors may render those correlations spurious. Autocorrelation may account for an apparent correlation over time. However, humans will make sense out of nearly anything.
6 Terms for uncertainty fall out of fashion in English-language books:
As the plot below demonstrates, the terms "ignorance", "ignorant", "unknown", "uncertain", and "falsehood" display a steady decline in relative frequency of occurrence in Google Books, starting around 1830 and ending with a slight upturn at the start of the new century.
[Figure: logit of relative frequency by year for "ignorance", "ignorant", "unknown", "uncertain", and "falsehood"]
7 Terms for uncertainty fall out of fashion in English-language books:
What is driving this? Could it be God?
[Figure: logit(god) by year]
8 Terms for uncertainty fall out of fashion in English-language books:
It looks plausible; the correlations are very high.
[Table: correlations among God, Ignorance, Ignorant, Unknown, Uncertain, and Falsehood]
Also, a search through the books containing such references reveals a potential link between mentions of these terms and references to God in the context of theological arguments.
9 Terms for uncertainty fall out of fashion in English-language books:
However, the partial autocorrelation functions below suggest that all of these variables are strongly autocorrelated (AR(1) or AR(2) processes).
10 Terms for uncertainty fall out of fashion in English-language books:
When autocorrelation is taken into account, the residual series no longer display strong correlations. The original correlations were inflated due to autocorrelation.
[Table: correlations among the residual series for God, Ignorance, Ignorant, Unknown, Uncertain, and Falsehood]
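The inflation effect described on slides 9 and 10 can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a reconstruction of the talk's analysis: it does not use the Google Books series. It generates pairs of independent AR(1) series and compares their raw correlations with the correlations between residuals from a least-squares AR(1) fit. The coefficient 0.95, the series length, the number of pairs, and all function names are illustrative choices.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_ar1(n, phi, rng):
    """Simulate x[t] = phi * x[t-1] + e[t] with standard normal noise."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def ar1_residuals(x):
    """Residuals from a least-squares regression of x[t] on x[t-1]."""
    prev, curr = x[:-1], x[1:]
    mp, mc = statistics.fmean(prev), statistics.fmean(curr)
    slope = (sum((p - mp) * (c - mc) for p, c in zip(prev, curr))
             / sum((p - mp) ** 2 for p in prev))
    intercept = mc - slope * mp
    return [c - intercept - slope * p for p, c in zip(prev, curr)]


def mean_abs_correlations(n_pairs=200, n=200, phi=0.95, seed=1):
    """Average |r| over independent pairs: raw series vs. AR(1) residuals."""
    rng = random.Random(seed)
    raw, whitened = [], []
    for _ in range(n_pairs):
        # x and y are generated independently, so any correlation is spurious.
        x = simulate_ar1(n, phi, rng)
        y = simulate_ar1(n, phi, rng)
        raw.append(abs(pearson(x, y)))
        whitened.append(abs(pearson(ar1_residuals(x), ar1_residuals(y))))
    return statistics.fmean(raw), statistics.fmean(whitened)


if __name__ == "__main__":
    raw, whitened = mean_abs_correlations()
    print(f"mean |r|, raw series:      {raw:.2f}")
    print(f"mean |r|, AR(1) residuals: {whitened:.2f}")
```

Because the two series in each pair share nothing except their autocorrelated structure, a large raw correlation here reflects that structure alone, and it should shrink once the AR(1) component is removed.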
11 Models of big data reveal underlying truths?
Correlation ain't causation, but what if there's a time-lag? People tend to attribute causal status to X if X always precedes Y, and especially if they would like to attribute Y to X (e.g., my AFP experience).
Nonstationarity: Most data-mining procedures assume that the processes generating the data are stable over time (i.e., stationarity). This often may be untrue, and the changes in those processes may go unaccounted for.
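The nonstationarity problem can be shown with a deliberately artificial example. In this sketch (all data and names are hypothetical, not from the talk), a straight line is fitted to the first half of a series whose generating process changes at the midpoint; the fit is perfect in-sample and badly wrong afterwards.

```python
def fit_line(t, y):
    """Ordinary least-squares fit of y = intercept + slope * t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    slope = (sum((a - mt) * (b - my) for a, b in zip(t, y))
             / sum((a - mt) ** 2 for a in t))
    return my - slope * mt, slope


def mean_abs_error(intercept, slope, t, y):
    """Mean absolute error of the fitted line on the points (t, y)."""
    return sum(abs(b - (intercept + slope * a)) for a, b in zip(t, y)) / len(t)


# Hypothetical series: rises by 2 per step for 50 steps, then falls by 1 per step.
t = list(range(100))
y = [2.0 * k if k < 50 else 150.0 - k for k in t]

# Fit on the first regime only, as a procedure assuming stationarity would.
intercept, slope = fit_line(t[:50], y[:50])
mae_in = mean_abs_error(intercept, slope, t[:50], y[:50])
mae_out = mean_abs_error(intercept, slope, t[50:], y[50:])

print(f"in-sample error:     {mae_in:.1f}")
print(f"out-of-sample error: {mae_out:.1f}")
```

Nothing in the first 50 observations signals the change, so no amount of in-sample checking would reveal that the fitted model stops applying.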
12 Big data are better than small data?
Big data can be big in at least three senses: large samples, lots of different pieces of information, and more intense surveillance. These may not be unalloyed goods.
Belief in small numbers: Large samples indeed do give more accurate and precise estimates than small samples, ceteris paribus. But most non-statisticians don't know or understand this. Instead, they over-estimate how representative of a population a small sample is.
More information can make us worse decision makers: Several experimental studies have shown that people are more confident and make worse predictions when given additional, but irrelevant, data. This is especially problematic for unstructured fishing expeditions for data.
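The "ceteris paribus" gain in precision from larger samples can be made concrete with the textbook margin of error for a sample proportion. This is a sketch that assumes simple random sampling and the normal approximation; the sample sizes and the function name are illustrative.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)


for n in (25, 100, 2500, 1_000_000):
    print(f"n = {n:>9,}: +/- {margin_of_error(n):.4f}")
```

The margin shrinks only with the square root of the sample size, and the formula says nothing about bias: a very large but unrepresentative sample yields a precise estimate of the wrong quantity.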
13 Big data are better than small data?
Bigger data yield better models for predicting the future? One of the potential advantages of big data is long memory. Longer-term data sets should give us a better model of the past, which in turn should enable us to better predict the future. But do they? Model inflation and nonstationarity pose problems here.
Bigger data result in greater control? Bigger data clearly won't yield better control if underlying causes are not understood and/or are not malleable by us. Surveillance and data-gathering about people may destroy trust and privacy (which are means of control as well as social capital), and also may engender reactance among those under surveillance (and thus a loss of control).