Tom Khabaza. Hard Hats for Data Miners: Myths and Pitfalls of Data Mining
By Tom Khabaza

The intrepid data miner runs many risks, including being buried under mountains of data. Some risks are just myths that need to be debunked. Others, however, are real. In this article, I will debunk several of these myths and misconceptions and then describe some problems and pitfalls commonly encountered when conducting data mining, along with steps that you can take to protect yourself from them.

A critical point to note is that data mining is a business process: a way of finding patterns in your data that provide insight you can use to conduct your business more effectively. Data mining also makes predictions to guide customer interactions and other business decisions. You'll see these points reinforced numerous times in the information that follows.

Myths and misconceptions about data mining

Myth #1: Data mining is all about algorithms

A businessperson attending a typical data mining conference or reading its proceedings might form the impression that data mining is all about advanced data analysis algorithms. This misconception might be summarized as follows: "All you need for data mining is good algorithms. The better your algorithms, the better your data mining; advancing the effectiveness of data mining means advancing our knowledge of algorithms."

To hold this view is to misunderstand the data mining process. Data mining is a process consisting of many elements: formulating business goals; mapping business goals to data mining goals; acquiring, understanding, and pre-processing the data; evaluating and presenting the results of analysis; and deploying these results to achieve business benefits.

This is not to minimize the importance of new or improved data mining algorithms. The problem occurs when data miners focus too much on the algorithms and ignore the rest of the data mining process. The consequences of this misconception can be disastrous for a data mining project, possibly resulting in a failure to produce any useful results. Experienced data miners recognize the need for a broader view of the data mining process.
Myth #2: Data mining is all about predictive accuracy

While data mining is not all about data analysis algorithms, there is a part of data mining that is about algorithms. This raises the question, "How can you judge the quality of an algorithm?" You might think that the main criterion would be the predictive accuracy of the models it generates. This view, however, misrepresents the role of algorithms in the data mining process.

It is true that a predictive model should have some degree of accuracy, because this demonstrates that it has truly discovered patterns in the data. However, the usefulness of an algorithm or model is also determined by a number of other properties, one of which is whether understanding the resulting model requires deep technical knowledge or is something that can be understood by a typical analyst.

Data miners who believe that predictive accuracy is the primary criterion of algorithm evaluation might use algorithms that can only be used by technology experts. These algorithms will then play only the most limited role, because data mining is a process that is driven by business expertise; it relies on the input and involvement of non-technical business professionals in order to be successful.

Myth #3: Data mining requires a data warehouse

Business people often think that a data warehouse is a prerequisite for data mining. This is a subtle misconception about the relationship between the two technologies. It is true that data mining can benefit from warehoused data that is well organized, relatively clean, and easy to access. This is particularly true if the warehouse has been constructed with data mining specifically in mind and with knowledge of the requirements of the data mining project. If this has not been the case, however, the warehoused data may be less useful for data mining than the source or operational data. In the worst case, warehoused data may be completely useless (for example, if only summary data are stored).

A more accurate depiction of the relationship between the two would be that data mining benefits from a properly designed data warehouse, and that constructing such a warehouse often benefits from first doing some exploratory data mining.

Myth #4: Data mining is all about vast quantities of data

Early explanations of data mining often began with statements like, "We now collect more data than ever, yet how are we to benefit from these vast data stores?" Focusing on the size of data stores provided a convenient introduction to the topic of data mining, but subtly misrepresented its nature. While there are many large datasets that organizations can benefit from mining, it would be a mistake to believe that these should be the sole focus of data mining. Many useful data mining projects are performed on small or medium-sized datasets; some, for example, contain only a few hundred or a few thousand records.

Subscribing to the erroneous belief that data mining is only appropriate for vast data stores would lead organizations to choose tools that sacrifice usability for scalability when, in fact, both attributes are essential. To quote a customer of a leading data mining tool: "Other data mining tools optimize machine time, but this tool optimizes my time." Whether the datasets are large or small, organizations should choose a data mining tool that optimizes the user's time.
Myth #5: Data mining should be done by a technology expert

Data mining uses advanced technology, and its workings, particularly those of modeling techniques, are unlikely to be understood by the wider IT community. Does this mean that data mining should be conducted only by those who understand every nuance of the technology that is involved?

Quite the opposite is true, due to the paramount importance of business knowledge in data mining. When performed without business knowledge, data mining can produce nonsensical or useless results (see Pitfall #4, below), so it is essential that data mining be performed by someone with extensive knowledge of the business problem. This is very seldom the same person who has extensive knowledge of the data mining technology. It is the responsibility of data mining tool providers to ensure that tools are accessible to business users.

Pitfalls of data mining and how to avoid them

Pitfall #1: Buried under mountains of data

Data mining should be an interactive, iterative process in which the analyst applies substantial business knowledge and is "engaged" with the data. However, those who hold myth #4 (that data mining is about vast quantities of data) often suppose that this process must be applied to all of the available data. This can lead to attempts to mine volumes of data for which the available hardware and software cannot provide an acceptable interactive response. In these situations, the data mining process becomes sluggish, and by the time a question is answered, the analyst cannot remember why it was asked.

The way to avoid this pitfall is to employ some form of sampling. For example, if we have a million customers and a 20 percent annual attrition (or "churn") rate, we need not plot our graphs or build our models using the full million examples, or even 500,000. Consider the following questions and answers:

Q: How many churn profiles do we expect to find?
A: Maybe ten

Q: How many examples of each profile do we need?
A: Maybe a thousand

Therefore, a sample of ten or twenty thousand churners and an equivalent number of non-churners is likely to be sufficient for this analysis. Note that this does not mean that data miners will never encounter the need to build models from millions of examples; only that they should not assume that they must do so, just because the data are available.

Pitfall #2: The Mysterious Disappearing Terabyte

This is a common phenomenon, but not always a pitfall. It refers to the fact that, for a given data mining problem, the amount of available and relevant data may be much less than initially supposed. Consider the following scenario: You are a data mining consultant, and your client is a large bank, which wishes to mine its customer data to determine credit risk. The bank holds terabytes of data on its customers and is concerned that the available computing resources may be inadequate to mine this volume of data.

Here's how the situation might unfold. Different types of credit (personal loans, business loans, overdrafts) present different patterns of credit risk, so each data mining project will concentrate on just one type of borrower. The bank's domain experts judge a number of factors to be relevant, and the bank, planning ahead, began collecting data on these factors about 18 months ago. Since then, almost a thousand cases of bad debt have occurred. Thus, the relevant data consist of fewer than a thousand cases of bad debt plus a sample from a plentiful supply of cases of good debt; let's say 3,000 records in all. Somehow, the need to mine terabytes of data has "mysteriously" disappeared.
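Returning to the churn example under Pitfall #1, the sample-size reasoning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer base is synthetic (plain integer ids, every fifth one marked as a churner to mimic the 20 percent rate), and the function names `required_sample_size` and `balanced_sample` are hypothetical, not part of any data mining tool.

```python
import random

def required_sample_size(expected_profiles, examples_per_profile):
    """Rough sample size: enough examples for every expected profile."""
    return expected_profiles * examples_per_profile

def balanced_sample(records, is_churner, n_per_class, seed=0):
    """Draw n_per_class churners and an equal number of non-churners."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    churners = [r for r in records if is_churner(r)]
    others = [r for r in records if not is_churner(r)]
    n = min(n_per_class, len(churners), len(others))
    return rng.sample(churners, n) + rng.sample(others, n)

# Assumption: a synthetic base of one million customer ids, 20 percent churners.
customers = range(1_000_000)

def churned(customer_id):
    return customer_id % 5 == 0

# "Maybe ten" profiles times "maybe a thousand" examples of each.
n_churners = required_sample_size(expected_profiles=10, examples_per_profile=1000)
sample = balanced_sample(customers, churned, n_churners)

print(n_churners, len(sample))
```

With these assumed figures the working set is 20,000 records rather than a million, which is the kind of volume that keeps graphs and models interactive.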
Pitfall #3: Disorganized data mining

Data mining can occasionally, despite the best of intentions, take place in an ad hoc manner, with no clear goals and no idea of how the results will be used. This leads to wasted time and unusable results. To produce useful results, it is critical to have clearly defined business and data mining goals, formulated early in the project, and clearly articulated deployment plans. A simple way of ensuring this is to use a standard process such as the CRoss-Industry Standard Process for Data Mining (CRISP-DM) [1]. Such a process ensures the correct preparation for data mining and provides a common language for communicating methods and results. Data mining tools should support standard process models.

Pitfall #4: Insufficient business knowledge

On a number of occasions this article has mentioned the crucial role that business knowledge plays in data mining. Without it, organizations can neither achieve useful results nor guide the data mining process towards them. It is sometimes supposed that the end user can reasonably tell the data miner: "Here are the data, please go away, do your data mining, and come back with the answers." If this were to happen, the project would, at best, take many long and costly iterations to produce useful results. At worst, the results would be gibberish, and the project would fail.

This pitfall can only be avoided by involving, at every stage of the data mining process, both the end user and someone with a detailed knowledge of the business. Ideally, the data miner or data mining consultant would have the business knowledge. Lacking it, the data miner should literally sit next to someone with the required business knowledge who understands the question under consideration. For this to work effectively, a highly interactive data mining environment with good response time is required.

Pitfall #5: Insufficient data knowledge

In order to perform data mining, we must be able to answer questions like "What do the codes in this field mean?" and "Can there be more than one record per customer in this table?" In some cases, this information is surprisingly hard to come by. It might be that the data expert has left the organization or moved to another department or, in the case of legacy systems, there may be no data expert at all. This problem is exacerbated when the database or data warehouse management is outsourced: the external supplier is even less motivated than the user organization to maintain this information "just in case it might be needed in future."

There is no simple resolution to this problem. IT departments should be made aware of the need to maintain information about their organization's databases. Also, when a data mining project is proposed, data miners should consider how much data knowledge is available and evaluate any risks caused by its absence or scarcity.

Pitfall #6: Erroneous assumptions, courtesy of the experts

Business and data experts are crucial resources, but this does not mean that the data miner should unquestioningly accept every statement they make. The data miner should seek to confirm the validity of experts' statements. Typical examples of erroneous or misleading statements might include:

- No customer can hold accounts of both these types
- No case will include more than one event of this type
- Only the following codes will be present in this field

Data miners should verify statements like these by examining the data. This is particularly important when processing of the data will depend on their accuracy. Ideally, mistakes in assumptions about data can be spotted before they lead to errors in the treatment of data. Data mining tools should make this easy to accomplish.

Pitfall #7: Incompatibility of data mining tools

The data mining process requires a wide range of capabilities, so it's not unusual that during a single project a wide variety of tools might be used.
This can, however, lead to high overhead costs due to the time and resources required to switch contexts and convert data from one format to another. At its worst, this can lead to the omission of necessary steps in the data mining process and can seriously interfere with the exploratory character of data mining.
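As an aside on Pitfall #6, checking experts' statements against the data often takes only a few lines. The sketch below is a minimal illustration, not a prescribed method: the `accounts` records, their field layout, the account types, and the status codes are all invented for the example, and the three helper functions are hypothetical names, each testing one of the three kinds of statement listed under that pitfall.

```python
from collections import Counter, defaultdict

# Hypothetical account records: (customer_id, account_type, status_code).
accounts = [
    (1, "savings", "A"),
    (2, "current", "A"),
    (2, "savings", "C"),
    (3, "current", "X"),
    (3, "current", "A"),
]

def customers_with_both(rows, type_a, type_b):
    """Customers holding both account types (claimed to be impossible)."""
    types = defaultdict(set)
    for customer, acct_type, _ in rows:
        types[customer].add(acct_type)
    return sorted(c for c, held in types.items() if {type_a, type_b} <= held)

def repeated_records(rows):
    """(customer, account type) pairs that occur more than once."""
    counts = Counter((customer, acct_type) for customer, acct_type, _ in rows)
    return sorted(key for key, n in counts.items() if n > 1)

def unexpected_codes(rows, allowed):
    """Status codes present in the data but absent from the expert's list."""
    return sorted({code for _, _, code in rows} - set(allowed))

print(customers_with_both(accounts, "savings", "current"))  # [2]
print(repeated_records(accounts))                           # [(3, 'current')]
print(unexpected_codes(accounts, allowed={"A", "C"}))       # ['X']
```

A non-empty result for any check means the corresponding statement does not hold for this data and should be taken back to the expert before any processing relies on it.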
The best solution is to use a data mining toolkit that integrates all the required capabilities. However, no toolkit will provide every possible capability, especially when the individual preferences of analysts are taken into account, so the toolkit should also be "open", that is, able to interface easily with other available tools and third-party options.

Pitfall #8: Locked in the data jail-house

In addition to openness with regard to tools, data mining solutions should also be open with regard to data. Some data mining tools require the data to be held in a proprietary format that is not compatible with commonly used database systems. (This is sometimes referred to as the "data jail-house.") This can result in high overhead costs, due to the need for transferring data into the required format, and lead to difficulty in deploying the results into an organization's operational systems. A good data mining tool will interface with your data via common standards.

Conclusion

Data mining is a business process, requiring extensive business knowledge. It is best practiced by business experts or by data mining experts in close collaboration with business experts.

Data mining uses a variety of techniques and should not focus only on modeling algorithms and their predictive accuracy. Each technique can play a variety of roles. During the data mining process, data miners interact and engage with the data in an iterative fashion. A standard data mining process model, such as CRISP-DM [1], helps to ensure the correct preparation for and use of data mining.

Data mining tools should be evaluated based on their accessibility to business users, their scalability and usability, and their support for standard processes.

Data miners should make intelligent decisions about the amount of data required, assuming neither that all of an organization's data will be relevant nor that all the available data will be required.
Effective data mining requires flexible and interoperable techniques. This requirement is best met by integrated, open toolkits that can interface to data by means of open standards.

References

[1] Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., and Wirth, R. CRISP-DM 1.0: Step-by-step data mining guide. CRISP-DM Consortium, 2000.

For further information about SPSS, contact SPSS Schweiz AG, Schneckenmannstrasse 25, 8044 Zürich.

SPSS is a registered trademark and the other SPSS products named are trademarks of SPSS Inc. All other names are trademarks of their respective owners. SPSS Inc. All rights reserved. DamiD/0404
More information5 Ways To Avoid Cash Flow Problems In Your Business
5 Ways To Avoid Cash Flow Problems In Your Business The old maxim revenue is vanity, cash is reality is easy to forget in the whirlwind life of a business owner especially when things are going well. Unfortunately
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationIBM Global Business Services Microsoft Dynamics CRM solutions from IBM
IBM Global Business Services Microsoft Dynamics CRM solutions from IBM Power your productivity 2 Microsoft Dynamics CRM solutions from IBM Highlights Win more deals by spending more time on selling and
More informationSTATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
More informationC A S E S T UDY The Path Toward Pervasive Business Intelligence at an Asian Telecommunication Services Provider
C A S E S T UDY The Path Toward Pervasive Business Intelligence at an Asian Telecommunication Services Provider Sponsored by: Tata Consultancy Services November 2008 SUMMARY Global Headquarters: 5 Speen
More informationBetter planning and forecasting with IBM Predictive Analytics
IBM Software Business Analytics SPSS Predictive Analytics Better planning and forecasting with IBM Predictive Analytics Using IBM Cognos TM1 with IBM SPSS Predictive Analytics to build better plans and
More informationDATA VISUALIZATION: When Data Speaks Business PRODUCT ANALYSIS REPORT IBM COGNOS BUSINESS INTELLIGENCE. Technology Evaluation Centers
PRODUCT ANALYSIS REPORT IBM COGNOS BUSINESS INTELLIGENCE DATA VISUALIZATION: When Data Speaks Business Jorge García, TEC Senior BI and Data Management Analyst Technology Evaluation Centers Contents About
More informationA Hurwitz white paper. Inventing the Future. Judith Hurwitz President and CEO. Sponsored by Hitachi
Judith Hurwitz President and CEO Sponsored by Hitachi Introduction Only a few years ago, the greatest concern for businesses was being able to link traditional IT with the requirements of business units.
More informationFrequency Matters. The keys to optimizing email send frequency
The keys to optimizing email send frequency Email send frequency requires a delicate balance. Send too little and you miss out on sales opportunities and end up leaving money on the table. Send too much
More informationData Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC
Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep Neil Raden Hired Brains Research, LLC Traditionally, the job of gathering and integrating data for analytics fell on data warehouses.
More informationFour Things You Must Do Before Migrating Archive Data to the Cloud
Four Things You Must Do Before Migrating Archive Data to the Cloud The amount of archive data that organizations are retaining has expanded rapidly in the last ten years. Since the 2006 amended Federal
More informationHexaware E-book on Predictive Analytics
Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,
More informationwww.moxonsolutions.com
www.moxonsolutions.com Introduction Moxon Intelligence Systems is a specialist predictive analytics development company. We focus on delivering software, consulting and training solutions that enable the
More informationIS YOUR DATA WAREHOUSE SUCCESSFUL? Developing a Data Warehouse Process that responds to the needs of the Enterprise.
IS YOUR DATA WAREHOUSE SUCCESSFUL? Developing a Data Warehouse Process that responds to the needs of the Enterprise. Peter R. Welbrock Smith-Hanley Consulting Group Philadelphia, PA ABSTRACT Developing
More informationAdvanced Big Data Analytics with R and Hadoop
REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional
More informationIndustry models for insurance. The IBM Insurance Application Architecture: A blueprint for success
Industry models for insurance The IBM Insurance Application Architecture: A blueprint for success Executive summary An ongoing transfer of financial responsibility to end customers has created a whole
More informationCon-way Freight. Leveraging best-of-breed business intelligence for customer satisfaction. Overview. Before: a company with a vision
Con-way Freight Leveraging best-of-breed business intelligence for customer satisfaction Overview The need To analyze transaction-level details on an ad hoc basis to optimize efficiencies based on outlier
More informationOptimizing Enrollment Management with Predictive Modeling
Optimizing Enrollment Management with Predictive Modeling Tips and Strategies for Getting Started with Predictive Analytics in Higher Education an ebook presented by Optimizing Enrollment with Predictive
More informationIncreasing marketing campaign profitability with predictive analytics
Executive report Increasing marketing campaign profitability with predictive analytics Table of contents Introduction..............................................................2 Focusing on the customer
More informationCreating an Effective Mystery Shopping Program Best Practices
Creating an Effective Mystery Shopping Program Best Practices BEST PRACTICE GUIDE Congratulations! If you are reading this paper, it s likely that you are seriously considering implementing a mystery shop
More informationWHITEPAPER. Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk
WHITEPAPER Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk Overview Angoss is helping its clients achieve significant revenue growth and measurable return
More informationRequirements Elicitation in Data Mining for Business Intelligence Projects
Requirements Elicitation in Data Mining for Business Intelligence Projects Paola Britos 1, Oscar Dieste 2 and Ramón García-Martínez 3 1 Software and Knowledge Engineering Center. Buenos Aires Institute
More informationPart II Management Accounting Decision-Making Tools
Part II Management Accounting Decision-Making Tools Chapter 7 Chapter 8 Chapter 9 Cost-Volume-Profit Analysis Comprehensive Business Budgeting Incremental Analysis and Decision-making Costs Chapter 10
More informationPredicting Churn. A SAS White Paper
A SAS White Paper Table of Contents Introduction......................................................................... 1 The Price of Churn...................................................................
More informationDeciding whether to purchase a tool or develop it in-house. by Elisabeth Hendrickson
Tools & Automation QUICK LOOK BuildIt Dispelling the myths surrounding both approaches Weighing your options or? BuyIt Deciding whether to purchase a tool or develop it in-house 32 You ve discovered that
More informationData Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2
More informationBusiness Case for Smart Care Software Product Portfolio
Business Case for Smart Care Software Product Portfolio Contents Company Overview... 3 Growing Challenges with Mobile Device Support... 3 Solution... 4 Privacy and Security... 6 Financial Benefits... 7
More informationExploratory Testing Dynamics
Exploratory Testing Dynamics Created by James Bach, Jonathan Bach, and Michael Bolton 1 v2.2 Copyright 2005-2009, Satisfice, Inc. Exploratory testing is the opposite of scripted testing. Both scripted
More informationData Mining with Microsoft SQL Server 2005
International DSI / Asia and Pacific DSI 2007 Full Paper (July, 2007) Data Mining with Microsoft SQL Server 2005 Henning Stolz 1), Peter Lehmann 1),Waranya Poonnawat 3) 1) Institute for Business Intelligence,
More informationUSING DATA MINING FOR BANK DIRECT MARKETING: AN APPLICATION OF THE CRISP-DM METHODOLOGY
USING DATA MINING FOR BANK DIRECT MARKETING: AN APPLICATION OF THE CRISP-DM METHODOLOGY Sérgio Moro and Raul M. S. Laureano Instituto Universitário de Lisboa (ISCTE IUL) Av.ª das Forças Armadas 1649-026
More informationRetail s Complexity: The Information Technology Solution
A P P L I C A T I O N S A WHITE PAPER SERIES COMPLEXITY OF PRODUCTS, SCALE AND PROCESSES, ALONG WITH SUPPLY CHAIN CHALLENGES, PLACE EVER GREATER DEMANDS ON RETAILERS. IT SYSTEMS ARE AT THE HEART OF RETAIL
More informationThe Role of Knowledge Based Systems to Enhance User Participation in the System Development Process
The Role of Knowledge Based Systems to Enhance User Participation in the System Development Process Gian M Medri, PKBanken, Stockholm Summary: Computers are a fact of life today, even for the public in
More informationDATA MINING AND CRM IN TELECOMMUNICATIONS
www.sjm.tf.bor.ac.yu Serbian Journal of Management 3 (1) (2008) 61-72 Serbian Journal of Management Abstract DATA MINING AND CRM IN TELECOMMUNICATIONS D. Ćamilović* BK Faculty of Management, Palmira Toljatija
More informationAn Enterprise Framework for Business Intelligence
An Enterprise Framework for Business Intelligence Colin White BI Research May 2009 Sponsored by Oracle Corporation TABLE OF CONTENTS AN ENTERPRISE FRAMEWORK FOR BUSINESS INTELLIGENCE 1 THE BI PROCESSING
More informationDatabase Marketing simplified through Data Mining
Database Marketing simplified through Data Mining Author*: Dr. Ing. Arnfried Ossen, Head of the Data Mining/Marketing Analysis Competence Center, Private Banking Division, Deutsche Bank, Frankfurt, Germany
More informationData Quality Assessment. Approach
Approach Prepared By: Sanjay Seth Data Quality Assessment Approach-Review.doc Page 1 of 15 Introduction Data quality is crucial to the success of Business Intelligence initiatives. Unless data in source
More informationNavigating Big Data business analytics
mwd a d v i s o r s Navigating Big Data business analytics Helena Schwenk A special report prepared for Actuate May 2013 This report is the third in a series and focuses principally on explaining what
More information