# Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd Edition

### Chapter 1

- What Is Data Mining?
- Where Is Data Mining Used?
- Origins of Data Mining
- Rapid Growth of Data Mining
- Why Are There So Many Different Methods?
- Terminology and Notation
- Road Maps to This Book

### Chapter 2 Overview of the Data Mining Process

- Introduction
- Core Ideas in Data Mining
- Supervised and Unsupervised Learning
- Steps in Data Mining
- Preliminary Steps
- Building a Model: Example with Linear Regression
- Using Excel for Data Mining

## Part II DATA EXPLORATION AND DIMENSION REDUCTION

### Chapter 3 Data Visualization

- Uses of Data Visualization
- Data Examples
- Basic Charts: Bar Charts, Line Graphs, and Scatterplots
- Multidimensional Visualization
- Specialized Visualizations
- Summary of Major Visualizations and Operations, According to Data Mining Goal

### Chapter 4 Dimension Reduction

- Introduction
- Practical Considerations
- Data Summaries
- Correlation Analysis
- Reducing the Number of Categories in Categorical Variables
- Converting a Categorical Variable to a Numerical Variable
- Principal Components Analysis
- Dimension Reduction Using Regression Models
- Dimension Reduction Using Classification and Regression Trees

## Part III PERFORMANCE EVALUATION

### Chapter 5 Evaluating Classification and Predictive Performance

- Introduction
- Judging Classification Performance
- Evaluating Predictive Performance

## Part IV PREDICTION AND CLASSIFICATION METHODS

### Chapter 6 Multiple Linear Regression

- Introduction
- Explanatory versus Predictive Modeling
- Estimating the Regression Equation and Prediction
- Variable Selection in Linear Regression

### Chapter 7 k-Nearest Neighbors (k-NN)

- k-NN Classifier (Categorical Outcome)
- k-NN for a Numerical Response
- Advantages and Shortcomings of k-NN Algorithms

### Chapter 8 Naive Bayes

- Introduction
- Applying the Full (Exact) Bayesian Classifier
- Advantages and Shortcomings of the Naive Bayes Classifier

### Chapter 9 Classification and Regression Trees

- Introduction
- Classification Trees
- Measures of Impurity
- Evaluating the Performance of a Classification Tree
- Avoiding Overfitting
- Classification Rules from Trees
- Classification Trees for More Than Two Classes
- Regression Trees
- Advantages, Weaknesses, and Extensions

### Chapter 10 Logistic Regression

- Introduction
- Logistic Regression Model
- Evaluating Classification Performance
- Example of Complete Analysis: Predicting Delayed Flights
- Appendix: Logistic Regression for Profiling

### Chapter 11 Neural Nets

- Introduction
- Concept and Structure of a Neural Network
- Fitting a Network to Data
- Required User Input
- Exploring the Relationship Between Predictors and Response
- Advantages and Weaknesses of Neural Networks

### Chapter 12 Discriminant Analysis

- Introduction
- Distance of an Observation from a Class
- Fisher's Linear Classification Functions
- Classification Performance of Discriminant Analysis
- Prior Probabilities
- Unequal Misclassification Costs
- Classifying More Than Two Classes
- Advantages and Weaknesses

## Part V MINING RELATIONSHIPS AMONG RECORDS

### Chapter 13 Association Rules

- Introduction
- Discovering Association Rules in Transaction Databases
- Generating Candidate Rules
- Selecting Strong Rules
- Summary

### Chapter 14 Cluster Analysis

- Introduction
- Measuring Distance Between Two Records
- Measuring Distance Between Two Clusters
- Hierarchical (Agglomerative) Clustering
- Nonhierarchical Clustering: The k-means Algorithm

## Part VI FORECASTING TIME SERIES

### Chapter 15 Handling Time Series

- Introduction
- Explanatory versus Predictive Modeling
- Popular Forecasting Methods in Business
- Time Series Components
- Data Partitioning

### Chapter 16 Regression-Based Forecasting

- Model with Trend
- Model with Seasonality
- Model with Trend and Seasonality
- Autocorrelation and ARIMA Models

### Chapter 17 Smoothing Methods

- Introduction
- Moving Average
- Simple Exponential Smoothing
- Advanced Exponential Smoothing

## Part VII CASES

### Chapter 18 Cases

- Charles Book Club
- German Credit
- Tayko Software Cataloger
- Segmenting Consumers of Bath Soap
- Direct-Mail Fundraising
- Catalog Cross Selling
- Predicting Bankruptcy
- Time Series Case: Forecasting Public Transportation Demand

References

Index

