Rethinking Museum Visitors: Using K-means Cluster Analysis to Explore a Museum s Audience
|
|
|
- Hugo Martin
- 10 years ago
- Views:
Transcription
1 Curator Rethinking Museum Visitors: Using K-means Cluster Analysis to Explore a Museum s Audience Amanda Krantz, Randi Korn, and Margaret Menninger Abstract Understanding visitors is a necessary and complex undertaking. In this article, we present K-means cluster analysis as one strategy that is particularly useful in unpacking the complex nature of museum visitors. Three questions organize the article and are as follows: 1) What is K-means cluster analysis? 2) How is K-means cluster analysis conducted? 3) Most importantly: What are the applications of K-means cluster analysis for museum practitioners? To answer these questions, we present five steps that are vital to conducting a K-means cluster analysis. We also present three cases studies to demonstrate differences among the results of three K-means cluster analyses and provide practical applications of the findings. To help museums successfully achieve the impact they intend, museum practitioners, evaluators, and researchers need to know about visitors including their attitudes, preferences, and previous experiences. While qualitative methodologies are often used to provide understanding of museum visitors, we find survey research and quantitative analysis to be equally insightful. In particular, we are using K-means cluster analysis, an exploratory statistical procedure that creates groups of like visitors from interval level data from a given data set, thereby providing a more descriptive understanding of visitors in a museum context. 1 Our objective in writing this article is to help readers understand K-means cluster analysis and its application. To that end, we have intentionally avoided overly technical terms and have structured the article around three questions: 1) What is K-means cluster Amanda Krantz ([email protected]) is a research associate, Randi Korn (korn@randikorn. com) is the founding director, and Margaret Menninger ([email protected]) is a statistician at Randi Korn and Associates, Inc. 363
2 364 krantz et al. exploring a museum s audience analysis? 2) How is K-means cluster analysis conducted? 3) Most importantly: What are the uses of cluster analysis findings for museum practitioners? Throughout the article, we draw from our experience using K-means cluster analysis in studies for the Dallas Museum of Art, the Sports Legends Museum at Camden Yards, and the San Francisco Museum of Modern Art (Randi Korn and Associates, Inc. 2005; 2008; 2009). We use examples from these studies. 2 What is K-means Cluster Analysis? There are several types of cluster analyses or clustering. 3 Regardless of the type, all clustering aims to make natural groups out of the given data. Natural is defined as fitting and of practical use. In creating natural clusters, cluster analysis aims to form internally similar groups that differ from each other in distinct and meaningful ways (Kachigan 1991; SPSS, Inc. 2003; Tan, Steinbach, and Kumar 2006). To conceptualize this idea, it is helpful to think of the common adage, Birds of a feather flock together. K-means cluster analysis is a statistical algorithm that partitions visitors into a specified number of natural clusters. As an introduction to the algorithm and analysis, consider figure 1, which plots 20 visitors ratings of two statements about sports on two 7-point rating scale variables. In Figure 1, each small data point whether it be a diamond, square, triangle, or circle identifies visitors ratings to two statements about sports. The four shapes of the data points identify visitors cluster membership (Cluster 1, Cluster 2, Cluster 3, Cluster 4). Visitors in Cluster 1 are identified by diamonds. Visitors in Cluster 2 are identified by squares. Visitors in Cluster 3 are identified by triangles, and visitors in Cluster 4 are identified by circles. At approximately the center of each cluster is an enlarged data point, which is the centroid. The centroid is the mean of all the data points in the cluster. Centroids are important to clustering. The distance more technically, the Euclidean distance of each data point to the centroid measures the proximity of the data point to the centroid, and proximity to the centroid determines cluster membership. 4 Figure 1 is a basic visualization of four clusters position on two variables. Nevertheless, it is useful in conceptualizing clusters, especially when thinking about the steps involved in K-means cluster analysis, which is discussed in detail in the next section. How is K-Means Cluster Analysis Conducted? At this point you probably have questions, such as: What instrument do you use to elicit the data for cluster analysis? How do you decide how many clusters to specify for the analysis? Once the computer derives the clusters, how do you interpret them? We answer these questions and others as we guide you through the steps of the analysis. To exemplify each step, we use concrete examples from studies we have conducted and provide a rationale for these steps.
3 curator 52/4 october Figure 1. Scatterplot of visitors ratings of two statements about sports. Step 1: Create a series of rating statements as variables by which to cluster visitors Rating statements are useful, since they go beyond demographics, investigating visitors thoughts, attitudes, and beliefs. Further, rating statements provide interval data that can be analyzed using K-means cluster analysis. In writing rating statements and determining associated rating scales, it is important to consider what you want to know and the question you want to answer. For instance, the Dallas Museum of Art wanted to understand visitors level of engagement with art (Randi Korn and Associates, Inc. 2005), while the San Francisco Museum of Art wanted to understand engagement with art among adults visiting with children between 4 and 11 years (Randi Korn and Associates, Inc. 2009). The statements we designed for each museum are different, since each set of statements consisted of responses to the particular questions each museum was asking. In another study, the Sport Legends Museum at Camden Yards wanted to understand what types of sports fans are visiting the Museum (Randi Korn and Associates Inc. 2008). Table 1 shows the statements we designed to explore types of sports fans. Crafting the statements and rating scales is crucial to the resulting clusters, since the statements are the variables by which the clusters are formed. The statements must directly link to your research question. If not, it is impossible to create natural clusters that are of any practical use. So, for example, when drafting the statements for the Sports Legends Museum at Camden Yards, it was necessary to investigate what attitudes and behaviors characterize different types of sports fan. In conducting such investigations, we recommend using research from the field, input from museum staff, and interviews with museum visitors to inform the statements. Further, it is imperative that you pretest these statements and the rating scale with museum visitors. 5 Step 2: Administer the questionnaire Once you have pretested your rating scale statements and made any necessary changes in the questionnaire, administer the question-
4 366 krantz et al. exploring a museum s audience Table 1. Statements and rating scale drafted for the Sports Legends Museum at Camden Yards (Randi Korn and Associates, Inc. 2008). What kind of sports fan are you? Does not describe me (1) Describes me very well (7) I prefer to watch sports by going to games When my team is losing, I usually feel bad I prefer to watch sports on TV I watch sports to cheer the entire team s effort I watch sports to see the athletes I like Sports has great meaning in my life I like learning about the history of my favorite teams I regularly attend sporting events I regularly participate in sports I go out of my way to learn the latest sports news naire to a randomly selected sample of visitors from the population you are studying. It is important for respondents to rate all of the statements. The K-means cluster program on SPSS and other statistical packages can handle large samples efficiently. Most of the cluster studies at Randi Korn and Associates have sample sizes in excess of 300 respondents. Step 3: Decide or experiment with how many clusters visitors should be grouped into The K-means algorithm requires you to specify in advance the number of clusters to be derived. This can be tricky. We always try more than one cluster solution. We generally ask for two, three, and four cluster solutions, and then review the results to identify which cluster solution is natural, fitting, and practical for the museum. In Step 5, we will discuss how to decide whether clusters are natural. Step 4: The statistical program outputs the clusters If, for example, we specify three clusters, the statistical program begins the algorithm by identifying three initial centroids, often at random. 6 Remember, the centroid is the mean value on all clustering variables of the cluster s members. Next, the statistical program runs a number of passes, or iterations. With each iteration, the program reassigns each case to the cluster with the closest centroid (based on Euclidean distance), and then recalculates the centroids based on the updated cluster membership. Iterations continue until cluster membership stabilizes with a solution that minimizes the variability within each cluster and maximizes the variability between each cluster (Kachigan 1991; SPSS, Inc. 2003; Tan, Steinbach, and Kumar 2006). Again, this is a step that you do not physically do, but rather, the statistical program you are using does. 7
5 curator 52/4 october Step 5: Determine whether the clusters are natural This is one of the most important steps of the cluster analysis because the researchers interpretation and judgment is required to determine whether the clusters are natural. In this way, cluster analysis is exploratory and similar to qualitative analysis. In considering whether the clusters are natural, you should consider the number of clusters defined as well as experiment with more or fewer clusters. Choosing the appropriate number of clusters is admittedly difficult and comes through experience and experimentation with the data (Dubes 1987; Koehly et al. 2001). For instance, it is important to consider whether the cluster membership of any one cluster is too small or too large. A cluster with just 10 members is probably too small to be practical, so the researcher should specify fewer clusters and re-run the analysis. On the other hand, if one cluster is overly large and dominant, the researcher might want to re-run the analysis with more clusters. In our experience, defining three to five clusters brings out the nuances but still creates practical groupings of visitors. However, we acknowledge that every data set is unique, and depending on the research question, there are uses for various numbers of clusters (Kachigan 1991). Once the clusters emerge, we take a close look at each cluster by comparing their demographic characteristics, visit characteristics, psychographic characteristics, and any other variables on the questionnaire. K-means cluster analysis does not always produce useful clusters. For example, you might find that a three-cluster solution simply reflects three age groups (three clusters: young visitors, middle-age visitors, and older visitors, for instance). That particular cluster solution doesn t add anything new to your understanding of visitors, since you have probably already examined your data for age differences. In general, however, we have found that carefully designed rating statements and scales yield data that produce useful clusters, meaning that the clusters enhance our understanding of visitors in interesting and distinctive new ways. K-means Cluster Analysis Is Not for Everyone While we have found that K-means cluster analysis is useful for audience research in museums, some statisticians do not think K-means cluster analysis is rigorous enough. In particular, the random assignment of the cores of clusters is problematic to some, and the researchers determination of natural clusters is problematic to others. Moreover, cluster results may not be robust. Adding cases to an existing data set or using an entirely new data set may yield a cluster solution that is quite different. Nevertheless, K-means is a well-accepted method in social science research, often used in data mining and analysis of social networks, particularly because it is exploratory (Huang 1998; Tan, Steinbach, and Kumar 2006). So, like other exploratory methods, if you find an interesting result, you should view it as a jumping off point for additional, confirmatory research.
6 368 krantz et al. exploring a museum s audience What are the Uses of Cluster Analysis Findings for Museum Practitioners? Before explaining how museum practitioners apply cluster analysis findings to their work, we present findings from studies at different museums. In reading the findings, we encourage you to consider the types of information presented as well as how museums could use the information. Visitors to the Dallas Museum of Art (DMA) In studying visitors level of engagement with art, we identified four clusters of visitors to the Dallas Museum of Art. The DMA labeled them as follows: Curious Participants (32 percent of visitors), Committed Enthusiasts (26 percent), Tentative Observers (23 percent), and Discerning Independents (19 percent) (Randi Korn and Associates 2005). Curious Participants are the largest cluster of visitors. Curious Participants are reasonably comfortable looking at art and want to connect with works of art in a variety of ways, including performances and readings. Visitors in this group have some difficulty with art terminology and are not particularly confident explaining it to others in spite of their reactions to art, which may be more emotional than cerebral. Committed Enthusiasts are the second largest cluster of visitors. Committed Enthusiasts are confident, enthusiastic, highly knowledgeable, and emotionally connected to works of art. They are comfortable looking at art and talking about it. These visitors are sponges for knowledge about art and seek information of all types and formats. Tentative Observers are the second smallest cluster of visitors. Tentative Observers are neither very knowledgeable about art, nor emotionally connected to art. They are uncomfortable talking with others about art, or explaining art to others. They are interested in obtaining straightforward, basic information about works of art. Discerning Independents are the smallest cluster of visitors. Discerning Independents are confident, highly knowledgeable and emotionally connected to works of art. They are comfortable looking at art and talking about it. Discerning Independents want to develop their own interpretations of art and are less interested in others explanations or views. Family Visitors to the San Francisco Museum of Modern Art In studying the level of engagement with art among adults visiting with children between 4 and 11 years, we identified three visitor clusters: Enthusiasts (50 percent of visitors), Art Lovers (27 percent), and Socials (23 percent) (Randi Korn and Associates 2009). Enthusiasts are the largest cluster of visitors. Enthusiasts are fans of art and art museums. They place the highest value on having an educational experience for their children, and they place the lowest value on having a spiritual experience looking at art. Art Lovers are the second largest cluster of visitors. Like Enthusiasts, Art Lovers enjoy art and art museums. However, Art Lovers value looking at art for their own visual pleasure, and they do not value making art with children. Socials are the smallest cluster of visitors. Compared to the other two clusters, socials are less enthusiastic about art and art museums. Socials least value having spiritual experiences while looking at art and most value spending time with family and friends.
7 curator 52/4 october Visitors to the Sports Legends Museum at Camden Yards In studying what kinds of sports fans are visitors to the Sports Legends Museum at Camden Yards, we identified four clusters: Middle-Road Fans (35 percent of visitors), TV Enthusiasts (32 percent), Active Enthusiasts (23 percent), and Indifferent Companions (11 percent) (Randi Korn and Associates 2008). Middle-Road Fans comprise the largest cluster. While interested in sports, they are not emotional, die-hard fans. Middle-Road Fans are interested in watching sports to cheer the entire team s effort, and they prefer going to games rather than watching sports on TV. Middle-Road Fans are moderately attentive to the latest sporting news, and they admit being somewhat unhappy when their favorite teams lose. TV Enthusiasts comprise the second largest cluster. They are highly engaged by sports and are team-connected and athlete-connected. Although TV Enthusiasts regularly attend sporting events, TV Enthusiasts are the only ones who respond favorably to the statement I prefer to watch sports on TV. Like Active Enthusiasts, TV Enthusiasts feel that sports have great meaning in their lives and they go out of their way to learn the latest sporting news. Active Enthusiasts comprise the second smallest cluster. They are highly committed to sports, have a powerful emotional connection to their favorite teams, and go out of their way to learn the latest sporting news. Of the four clusters, Active Enthusiasts have the strongest preference for watching sports by going to games and are least interested in watching sports on TV. They are also far more likely than are members of the other three clusters to regularly participate in sports. Indifferent Companions comprise the smallest cluster of visitors. Indifferent Companions do not relate to sports and would not call themselves sports fans. Indifferent Companions have little interest in sporting news, they do not feel that sports has meaning in their lives, and they do not regularly participate in sports. Applications of Cluster Analysis Findings In talking with museum staff and hearing their reactions, we have found that cluster analysis findings help museum staff come to an understanding of their visitors. The task is challenging, given that human diversity is so complex. Cluster analysis is useful since it allows for the nuances of visitors to emerge, yet it also groups similar visitors according to a specific research question. To clarify, let s further consider the findings from the Dallas Museum of Art (DMA). In studying visitors level of engagement with art at the DMA, we noticed that certain clusters of visitors responded similarly to particular statements. As seen in table 2, which displays the mean rating by each cluster, Curious Participants and Discerning Independents rated the statements I enjoy talking with others about the art we are looking at and I like to know about the materials and techniques used by the artist in similar ways. Discerning Independents and Committed Enthusiasts rated the statements I feel comfortable looking at most types of art and Art affects me emotionally similarly. Table 2 also shows how the clusters are most different. There is great variation in
8 370 krantz et al. exploring a museum s audience Table 2. Ratings of art viewing preferences by cluster (Randi Korn and Associates, Inc. 2005) 7-point rating scale: Does not describe me (1) Describes me very well (7) I like to view a work of art on my own, without explanations or interpretations. 1 I am comfortable explaining the meaning of a work of art to a friend. 2 Tentative observers (n = 256) Curious participants (n = 352) Cluster Discerning independents (n = 211) Committed enthusiasts (n = 284) mean mean mean mean Some terms used in art museums are difficult for me to understand. 3 I enjoy talking with others about the art we are looking at. 4 I like to know about the materials and techniques used by the artist. 5 I feel comfortable looking at most types of art Art affects me emotionally I like to be told a straight-forward insight to help me know what the work of art is about I like to know the story portrayed in a work of art. 9 I like to connect with works of art through music, dance, dramatic performances, readings F = , p =.000; 2 F = , p =.000; 3 F = , p =.000; 4 F = , p =.000; 5 F = , p =.000; 6 F = , p =.000; 7 F = , p =.000; 8 F = , p =.000; 9 F = , p =.000; 10 F = , p =.000. the ratings for the statements I am comfortable explaining the meaning of a work of art to a friend and I like to be told a straightforward insight to help me know what the work of art is about. Another distinguishing statement was: I like to view art on my own without explanations. This statement was instrumental in differentiating the two clusters of visitors with a high level of interest in art: Committed Enthusiasts and Discerning Independents. These two clusters are both at the top level of engagement with art, but one group doesn t want to be told what to think, while the other group is a sponge for information. 8 Understanding how clusters are most similar and dissimilar is extremely useful for marketing, programming, and even exhibition design. For instance, if a museum wanted to reach a broad range of visitors, it may look to find the similarities among clusters. For example, data indicate that the DMA might reach a broad range of visitors by offering a program on art technique, since three clusters Curious Participants, Discerning Inde-
9 curator 52/4 october Table 3. Ratings of sports identity (Randi Korn and Associates, Inc. 2008). 7-point rating scale: Does not describe me (1) Describes me very well (7) Indifferent companions (n = 32) Middle-road fans (n =106) Cluster TV enthusiasts (n = 95) Active enthusiasts (n = 69) mean mean mean mean I like learning about the history of my favorite teams. 1 I watch sports to cheer the entire team s effort. 2 Sports has great meaning in my life. 3 I prefer to watch sports by going to games. 4 I go out of my way to learn the latest sports news. 5 I regularly attend sporting events. 6 When my team is losing I usually feel bad. 7 I watch sports to see the athletes I like. 8 I regularly participate in sports I prefer to watch sports on TV F = , p =.000; 2 F = , p =.000; 3 F = , p =.000; 4 F = , p =.000; 5 F = , p =.000; 6 F = , p =.000; 7 F = , p =.000; 8 F = , p =.000; 9 F = ; p =.000; 10 F = , p =.000 pendents, and Committed Enthusiasts agreed relatively strongly with the statement I like to know about the materials and techniques used by the artist. A museum might also want to know how the clusters are different, since differences indicate aspects that will prove least successful in reaching the broadest audience. For instance, the variation in responses to the statement I like to be told a straightforward insight to help me know what the work of art is about indicates that interpretation that only provides straightforward insights is not a successful strategy for reaching all clusters. While making the museum accessible to everyone is important, museums may also consider focusing on one particular cluster in terms of marketing, programming, and event planning. Selecting one cluster to serve in a program, for example, is highly effective. Museum practitioners can better focus that program if they know the characteristics of a cluster. For instance, the Sports Legends Museum at Camden Yards may consider offering talks with baseball players to target TV Enthusiasts. We know that TV Enthusiasts watch sports to see the athletes they like, like learning about the history of their favorite teams, and stay up-to-date on the latest sports news (Randi Korn and Associates 2008). (See table 3).
10 372 krantz et al. exploring a museum s audience Even when a museum chooses to focus its resources and programming on one cluster, such a decision does not preclude other clusters from attending and enjoying the experience, since visitors will not even know about these high-level decisions. Clusters can inform resource allocation. In this way, museums can use their resources to achieve the most impact by serving one segment of the audience very well. Conclusion We find K-means cluster analysis to be a very useful analysis tool for museums that want to explore their visitors from a particular vantage point. We also use the concept of clustering to demonstrate that museum visitors are complicated and nuanced. When people walk into a museum, their dispositions do not change they are who they are. Museum environments, though, can bring out different elements of visitors inherent personalities and behaviors. We use K-means cluster analysis to explore museum-motivated ideas in order to provide an enhanced understanding of and insight into people. Acknowledgments We would like to thank the Dallas Museum of Art, the San Francisco Museum of Modern Art, and the Sports Legends Museum at Camden Yards for giving us permission to share the results of the studies discussed in this article. Notes 1. Please note that our intention in conducting K-means cluster analysis is simply to understand a particular museum s visitors. We are not using findings to understand a larger or different population such as non-museum visitors. 2. All three studies are cited as unpublished manuscripts because they are the property of the museums that contracted the study. However, thanks to each museum, all three reports are available via the Internet. Audience Research: Levels of Engagement with Art SM, A Two-Year Study, is accessed at the Dallas Museum of Art Web site, Audience Research: Exploring Family Visitors to Art Museums, for SFMOMA, and Audience Research: Sports Legends Museum at Camden Yards 2008 Visitor Survey are available at Please refer to the reports if you desire further information regarding the methodology of each study, including sample sizes and response rates. 3. Other clustering methods include Hierarchical clustering and Two-step clustering. We use K-means clustering because it works efficiently with large data sets of interval or continuous data (SAS Institute 2004; SPSS, Inc. 2003). 4. Euclidean distance is the most commonly used distance metric; in fact, it is the
11 curator 52/4 october only distance metric option available in the K-means clustering algorithms in the SPSS and SAS statistical software packages (SAS Institute 2004; SPSS, Inc. 2003). 5. We cannot stress enough the importance of developing quality rating statements and scales that are pretested. In addition to examining statements and scales for clarity and content, we check the pretest results to see that the statements yield data with sufficient inter-item variability. If there is little variability in the data, you will not find substantial or meaningful distinctions among clusters. This is particularly important when conducting research on museum visitors an audience that is relatively homogeneous. 6. The researcher may also specify the initial centroid values based on previous research. A word of caution: The solution reached by the K-means algorithm depends on the starting points, and different initial centroid values may lead to different cluster solutions. If there is any concern that the cluster solution reflects a local rather than global optimum, the researcher can run the K-means algorithm for a given number of clusters several times using different initial centroid values and then choose the solution with the smallest sum of the squared errors (the error is the distance from each case to the nearest centroid). Both SPSS and SAS programs will save the distance to center for each case in the data file. Ultimately, the validity of any cluster solution must be scrutinized by ascertaining whether or not the clusters differ from each other in distinct, meaningful ways that tell you something interesting and useful about the museum s visitors. This issue is discussed in more detail in the Applications of Cluster Analysis Findings section of the report. 7. Always check the iteration history on the program output to confirm that iterations converged with a stable solution. If there is no convergence, increase the maximum number of iterations and re-run the analysis. 8. This distinction is most interesting because the DMA s LOEA hypothesis states that there should be three levels of engagement with art, implying that there should be only three distinct clusters. References Dubes, R How many clusters are best?: An experiment. Pattern Recognition 20(6): Huang, Z Extensions to the K-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery 2: Kachigan, S. K Multivariate Statistical Analysis: A Conceptual Introduction. New York: Radius Press. Koehly, L., P. Arabie, E. Bradlow, W. Hutchinson How do I choose the optimal number of clusters in cluster analysis? The Journal of Consumer Psychology 10(1/2): Randi Korn and Associates, Inc Audience Research: Exploring Family Visitors to Art Museums. San Francisco, CA: San Francisco Museum of Modern Art. Accessed at
12 374 krantz et al. exploring a museum s audience Randi Korn and Associates, Inc Audience Research: Sports Legends Museum at Camden Yards 2008 Visitor Survey. Baltimore, MD: Sports Legends Museum at Camden Yards. Accessed at Randi Korn and Associates, Inc Audience Research: Levels of Engagement with Art SM, A Two-Year Study, Dallas, TX: Dallas Museum of Art. Accessed at SAS Institute SAS/STAT 9.1 Users Guide. Cary, NC: SAS Publishing. SPSS, Inc SPSS for Windows Computer software. The Apache Software Foundation. Tan, P., M. Steinbach, and V. Kumar Introduction to Data Mining. Boston: Pearson Addison-Wesley.
There are a number of different methods that can be used to carry out a cluster analysis; these methods can be classified as follows:
Statistics: Rosie Cornish. 2007. 3.1 Cluster Analysis 1 Introduction This handout is designed to provide only a brief introduction to cluster analysis and how it is done. Books giving further details are
Segmentation: Foundation of Marketing Strategy
Gelb Consulting Group, Inc. 1011 Highway 6 South P + 281.759.3600 Suite 120 F + 281.759.3607 Houston, Texas 77077 www.gelbconsulting.com An Endeavor Management Company Overview One purpose of marketing
STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and
Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table
CLUSTER ANALYSIS FOR SEGMENTATION
CLUSTER ANALYSIS FOR SEGMENTATION Introduction We all understand that consumers are not all alike. This provides a challenge for the development and marketing of profitable products and services. Not every
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
The Science and Art of Market Segmentation Using PROC FASTCLUS Mark E. Thompson, Forefront Economics Inc, Beaverton, Oregon
The Science and Art of Market Segmentation Using PROC FASTCLUS Mark E. Thompson, Forefront Economics Inc, Beaverton, Oregon ABSTRACT Effective business development strategies often begin with market segmentation,
Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca
Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?
Clustering UE 141 Spring 2013
Clustering UE 141 Spring 013 Jing Gao SUNY Buffalo 1 Definition of Clustering Finding groups of obects such that the obects in a group will be similar (or related) to one another and different from (or
Data Mining Clustering (2) Sheets are based on the those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining
Data Mining Clustering (2) Toon Calders Sheets are based on the those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining Outline Partitional Clustering Distance-based K-means, K-medoids,
Cluster Analysis. Alison Merikangas Data Analysis Seminar 18 November 2009
Cluster Analysis Alison Merikangas Data Analysis Seminar 18 November 2009 Overview What is cluster analysis? Types of cluster Distance functions Clustering methods Agglomerative K-means Density-based Interpretation
POST-HOC SEGMENTATION USING MARKETING RESEARCH
Annals of the University of Petroşani, Economics, 12(3), 2012, 39-48 39 POST-HOC SEGMENTATION USING MARKETING RESEARCH CRISTINEL CONSTANTIN * ABSTRACT: This paper is about an instrumental research conducted
CSU, Fresno - Institutional Research, Assessment and Planning - Dmitri Rogulkin
My presentation is about data visualization. How to use visual graphs and charts in order to explore data, discover meaning and report findings. The goal is to show that visual displays can be very effective
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analsis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining b Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining /8/ What is Cluster
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/2004 Hierarchical
Comparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar 1 What is Cluster Analysis? Finding groups of objects such that the objects in a group will
Strategic Online Advertising: Modeling Internet User Behavior with
2 Strategic Online Advertising: Modeling Internet User Behavior with Patrick Johnston, Nicholas Kristoff, Heather McGinness, Phuong Vu, Nathaniel Wong, Jason Wright with William T. Scherer and Matthew
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Clustering Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analsis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining b Tan, Steinbach, Kumar Clustering Algorithms K-means and its variants Hierarchical clustering
Clustering. 15-381 Artificial Intelligence Henry Lin. Organizing data into clusters such that there is
Clustering 15-381 Artificial Intelligence Henry Lin Modified from excellent slides of Eamonn Keogh, Ziv Bar-Joseph, and Andrew Moore What is Clustering? Organizing data into clusters such that there is
Simple Predictive Analytics Curtis Seare
Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use
Distances, Clustering, and Classification. Heatmaps
Distances, Clustering, and Classification Heatmaps 1 Distance Clustering organizes things that are close into groups What does it mean for two genes to be close? What does it mean for two samples to be
How to Get More Value from Your Survey Data
Technical report How to Get More Value from Your Survey Data Discover four advanced analysis techniques that make survey research more effective Table of contents Introduction..............................................................2
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Visualization Quick Guide
Visualization Quick Guide A best practice guide to help you find the right visualization for your data WHAT IS DOMO? Domo is a new form of business intelligence (BI) unlike anything before an executive
Measurement Information Model
mcgarry02.qxd 9/7/01 1:27 PM Page 13 2 Information Model This chapter describes one of the fundamental measurement concepts of Practical Software, the Information Model. The Information Model provides
Impact / Performance Matrix A Strategic Planning Tool
Impact / Performance Matrix A Strategic Planning Tool Larry J. Seibert, Ph.D. When Board members and staff convene for strategic planning sessions, there are a number of questions that typically need to
Cluster analysis with SPSS: K-Means Cluster Analysis
analysis with SPSS: K-Means Analysis analysis is a type of data classification carried out by separating the data into groups. The aim of cluster analysis is to categorize n objects in k (k>1) groups,
Local outlier detection in data forensics: data mining approach to flag unusual schools
Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential
Data Mining. for Process Improvement DATA MINING. Paul Below, Quantitative Software Management, Inc. (QSM)
Data mining techniques can be used to help thin out the forest so that we can examine the important trees. Hopefully, this article will encourage you to learn more about data mining, try some of the techniques
A Cluster Analysis Approach for Banks Risk Profile: The Romanian Evidence
109 European Research Studies, Volume XII, Issue (1) 2009 A Cluster Analysis Approach for Banks Risk Profile: The Romanian Evidence By Nicolae DARDAC 1 Iustina Alina BOITAN 2 Abstract: Cluster analysis,
A successful market segmentation initiative answers the following critical business questions: * How can we a. Customer Status.
MARKET SEGMENTATION The simplest and most effective way to operate an organization is to deliver one product or service that meets the needs of one type of customer. However, to the delight of many organizations
K-Means Clustering Tutorial
K-Means Clustering Tutorial By Kardi Teknomo,PhD Preferable reference for this tutorial is Teknomo, Kardi. K-Means Clustering Tutorials. http:\\people.revoledu.com\kardi\ tutorial\kmean\ Last Update: July
Statistical Databases and Registers with some datamining
Unsupervised learning - Statistical Databases and Registers with some datamining a course in Survey Methodology and O cial Statistics Pages in the book: 501-528 Department of Statistics Stockholm University
New Jersey Core Curriculum Content Standards for Visual and Performing Arts INTRODUCTION
Content Area Standard Strand By the end of grade P 2 New Jersey Core Curriculum Content Standards for Visual and Performing Arts INTRODUCTION Visual and Performing Arts 1.4 Aesthetic Responses & Critique
SPSS Tutorial. AEB 37 / AE 802 Marketing Research Methods Week 7
SPSS Tutorial AEB 37 / AE 802 Marketing Research Methods Week 7 Cluster analysis Lecture / Tutorial outline Cluster analysis Example of cluster analysis Work on the assignment Cluster Analysis It is a
NORTHERN VIRGINIA COMMUNITY COLLEGE PSYCHOLOGY 211 - RESEARCH METHODOLOGY FOR THE BEHAVIORAL SCIENCES Dr. Rosalyn M.
NORTHERN VIRGINIA COMMUNITY COLLEGE PSYCHOLOGY 211 - RESEARCH METHODOLOGY FOR THE BEHAVIORAL SCIENCES Dr. Rosalyn M. King, Professor DETAILED TOPICAL OVERVIEW AND WORKING SYLLABUS CLASS 1: INTRODUCTIONS
Machine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
Statistics and Probability
Statistics and Probability TABLE OF CONTENTS 1 Posing Questions and Gathering Data. 2 2 Representing Data. 7 3 Interpreting and Evaluating Data 13 4 Exploring Probability..17 5 Games of Chance 20 6 Ideas
IBM SPSS Direct Marketing 23
IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release
Cluster Analysis: Advanced Concepts
Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means
Robust Outlier Detection Technique in Data Mining: A Univariate Approach
Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,
UNDERSTANDING CUSTOMERS - PROFILING AND SEGMENTATION
UNDERSTANDING CUSTOMERS - PROFILING AND SEGMENTATION Mircea Andrei SCRIDON Babes Bolyai University of Cluj-Napoca Abstract: In any industry, the first step to finding and creating profitable customers
Cluster Analysis for Evaluating Trading Strategies 1
CONTRIBUTORS Jeff Bacidore Managing Director, Head of Algorithmic Trading, ITG, Inc. [email protected] +1.212.588.4327 Kathryn Berkow Quantitative Analyst, Algorithmic Trading, ITG, Inc. [email protected]
IBM SPSS Direct Marketing 22
IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release
Cluster Analysis. Isabel M. Rodrigues. Lisboa, 2014. Instituto Superior Técnico
Instituto Superior Técnico Lisboa, 2014 Introduction: Cluster analysis What is? Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from
Categorical Data Visualization and Clustering Using Subjective Factors
Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,
Getting Behind The Customer Experience Wheel
Getting Behind The Customer Experience Wheel Create a Voice of the Customer Program for your Organization In any business, serving your customers well is critical to success, loyalty and growth. But do
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
K-means Clustering Technique on Search Engine Dataset using Data Mining Tool
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 6 (2013), pp. 505-510 International Research Publications House http://www. irphouse.com /ijict.htm K-means
Data Mining Cluster Analysis: Advanced Concepts and Algorithms. Lecture Notes for Chapter 9. Introduction to Data Mining
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Lecture Notes for Chapter 9 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004
Attitudes Towards Digital Audio Advertising
Attitudes Towards Digital Audio Advertising A white paper developed by TargetSpot 2012 TargetSpot, Inc. Research by Parks Associates. All Rights Reserved. Forward by: FOREWORD By now everyone realizes
Decision Support System Methodology Using a Visual Approach for Cluster Analysis Problems
Decision Support System Methodology Using a Visual Approach for Cluster Analysis Problems Ran M. Bittmann School of Business Administration Ph.D. Thesis Submitted to the Senate of Bar-Ilan University Ramat-Gan,
Multivariate testing. Understanding multivariate testing techniques and how to apply them to your email marketing strategies
Understanding multivariate testing techniques and how to apply them to your email marketing strategies An Experian Marketing Services white paper Testing should be standard operating procedure Whether
SUPPLY CHAIN SEGMENTATION 2.0: WHAT S NEXT. Rich Becks, General Manager, E2open. Contents. White Paper
White Paper SUPPLY CHAIN SEGMENTATION 2.0: WHAT S NEXT Rich Becks, General Manager, E2open 2 3 4 8 Contents Supply Chain Segmentation, Part II: Advances and Advantages A Quick Review of Supply Chain Segmentation
Clustering Connectionist and Statistical Language Processing
Clustering Connectionist and Statistical Language Processing Frank Keller [email protected] Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised
Analyzing Research Articles: A Guide for Readers and Writers 1. Sam Mathews, Ph.D. Department of Psychology The University of West Florida
Analyzing Research Articles: A Guide for Readers and Writers 1 Sam Mathews, Ph.D. Department of Psychology The University of West Florida The critical reader of a research report expects the writer to
Fairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
Data Mining Project Report. Document Clustering. Meryem Uzun-Per
Data Mining Project Report Document Clustering Meryem Uzun-Per 504112506 Table of Content Table of Content... 2 1. Project Definition... 3 2. Literature Survey... 3 3. Methods... 4 3.1. K-means algorithm...
Ira J. Haimowitz Henry Schwarz
From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Clustering and Prediction for Credit Line Optimization Ira J. Haimowitz Henry Schwarz General
10. Analysis of Longitudinal Studies Repeat-measures analysis
Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.
Helen Featherstone, Visitor Studies Group and University of the West of England, Bristol
AUDIENCE RESEARCH AS A STRATEGIC MANAGEMENT TOOL Sofie Davis, Natural History Museum Helen Featherstone, Visitor Studies Group and University of the West of England, Bristol Acknowledgements: The authors
Measuring the response of students to assessment: the Assessment Experience Questionnaire
11 th Improving Student Learning Symposium, 2003 Measuring the response of students to assessment: the Assessment Experience Questionnaire Graham Gibbs and Claire Simpson, Open University Abstract A review
Behavioral Segmentation
Behavioral Segmentation TM Contents 1. The Importance of Segmentation in Contemporary Marketing... 2 2. Traditional Methods of Segmentation and their Limitations... 2 2.1 Lack of Homogeneity... 3 2.2 Determining
Constrained Clustering of Territories in the Context of Car Insurance
Constrained Clustering of Territories in the Context of Car Insurance Samuel Perreault Jean-Philippe Le Cavalier Laval University July 2014 Perreault & Le Cavalier (ULaval) Constrained Clustering July
Running head: FROM IN-PERSON TO ONLINE 1. From In-Person to Online : Designing a Professional Development Experience for Teachers. Jennifer N.
Running head: FROM IN-PERSON TO ONLINE 1 From In-Person to Online : Designing a Professional Development Experience for Teachers Jennifer N. Pic Project Learning Tree, American Forest Foundation FROM IN-PERSON
A GENERAL TAXONOMY FOR VISUALIZATION OF PREDICTIVE SOCIAL MEDIA ANALYTICS
A GENERAL TAXONOMY FOR VISUALIZATION OF PREDICTIVE SOCIAL MEDIA ANALYTICS Stacey Franklin Jones, D.Sc. ProTech Global Solutions Annapolis, MD Abstract The use of Social Media as a resource to characterize
One Size Doesn t Fit All: How a Segmentation Approach Can Help Guide CE Product Strategy
One Size Doesn t Fit All: How a Segmentation Approach Can Help Guide CE Product Strategy Presenters Joe Bates Director of Research Gina Woodall Vice President More than 2,100 members Top 20 Trade Association
Overview of Factor Analysis
Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August 1,
ABSTRACT INTRODUCTION
行 政 院 國 家 科 學 委 員 會 專 題 研 究 計 畫 成 果 報 告 線 上 購 物 介 面 成 份 在 行 銷 訊 息 的 思 考 可 能 性 模 式 中 所 扮 演 角 色 之 研 究 The Role of Online Shopping s Interface Components in Elaboration of the Marketing Message 計 畫 編 號 :
Main Effects and Interactions
Main Effects & Interactions page 1 Main Effects and Interactions So far, we ve talked about studies in which there is just one independent variable, such as violence of television program. You might randomly
K-Means Cluster Analysis. Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1
K-Means Cluster Analsis Chapter 3 PPDM Class Tan,Steinbach, Kumar Introduction to Data Mining 4/18/4 1 What is Cluster Analsis? Finding groups of objects such that the objects in a group will be similar
ANALYZING THE CONSUMER PROFILING FOR IMPROVING EFFORTS OF INTEGRATED MARKETING COMMUNICATION
Olimpia OANCEA Mihaela DIACONU Amalia DUŢU University of Pitesti ANALYZING THE CONSUMER PROFILING FOR IMPROVING EFFORTS OF INTEGRATED MARKETING COMMUNICATION Empirical studies Keywords Market segmentation
Visualizing non-hierarchical and hierarchical cluster analyses with clustergrams
Visualizing non-hierarchical and hierarchical cluster analyses with clustergrams Matthias Schonlau RAND 7 Main Street Santa Monica, CA 947 USA Summary In hierarchical cluster analysis dendrogram graphs
Data Mining. Cluster Analysis: Advanced Concepts and Algorithms
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based
Organizing Your Approach to a Data Analysis
Biost/Stat 578 B: Data Analysis Emerson, September 29, 2003 Handout #1 Organizing Your Approach to a Data Analysis The general theme should be to maximize thinking about the data analysis and to minimize
DISCRIMINANT FUNCTION ANALYSIS (DA)
DISCRIMINANT FUNCTION ANALYSIS (DA) John Poulsen and Aaron French Key words: assumptions, further reading, computations, standardized coefficents, structure matrix, tests of signficance Introduction Discriminant
Data Mining Individual Assignment report
Björn Þór Jónsson [email protected] Data Mining Individual Assignment report This report outlines the implementation and results gained from the Data Mining methods of preprocessing, supervised learning, frequent
Improving Access and Quality of Health Care for Deaf Populations. A Collaborative Project of the Sinai Health System and. Advocate Health Care
Improving Access and Quality of Health Care for Deaf Populations A Collaborative Project of the Sinai Health System and Advocate Health Care July 1, 2002 December 31, 2012 LESSONS LEARNED Overarching Lesson:
Assessing Online Asynchronous Discussion in Online Courses: An Empirical Study
Assessing Online Asynchronous Discussion in Online Courses: An Empirical Study Shijuan Liu Department of Instructional Systems Technology Indiana University Bloomington, Indiana, USA [email protected]
USING DATA SCIENCE TO DISCOVE INSIGHT OF MEDICAL PROVIDERS CHARGE FOR COMMON SERVICES
USING DATA SCIENCE TO DISCOVE INSIGHT OF MEDICAL PROVIDERS CHARGE FOR COMMON SERVICES Irron Williams Northwestern University [email protected] Abstract--Data science is evolving. In
The So What? Questions
4 The So What? Questions The work begins with questions. Asking a question can be a sacred act. A real question assumes a dialogue, a link to the source from which answers come. Asking a question in a
Clustering. Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016
Clustering Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016 1 Supervised learning vs. unsupervised learning Supervised learning: discover patterns in the data that relate data attributes with
Sample Reporting. Analytics and Evaluation
Sample Reporting Analytics and Evaluation This sample publication is provided with the understanding that company names and related example reporting are solely illustrative and the content does not constitute
Measurement with Ratios
Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical
The primary goal of this thesis was to understand how the spatial dependence of
5 General discussion 5.1 Introduction The primary goal of this thesis was to understand how the spatial dependence of consumer attitudes can be modeled, what additional benefits the recovering of spatial
Information Retrieval and Web Search Engines
Information Retrieval and Web Search Engines Lecture 7: Document Clustering December 10 th, 2013 Wolf-Tilo Balke and Kinda El Maarry Institut für Informationssysteme Technische Universität Braunschweig
M15_BERE8380_12_SE_C15.7.qxd 2/21/11 3:59 PM Page 1. 15.7 Analytics and Data Mining 1
M15_BERE8380_12_SE_C15.7.qxd 2/21/11 3:59 PM Page 1 15.7 Analytics and Data Mining 15.7 Analytics and Data Mining 1 Section 1.5 noted that advances in computing processing during the past 40 years have
Using Choice-Based Market Segmentation to Improve Your Marketing Strategy
Using Choice-Based Market Segmentation to Improve Your Marketing Strategy Dr. Bruce Isaacson, President of MMR Strategy Group Dominique Romanowski, Vice President of MMR Strategy Group 16501 Ventura Boulevard,
Exploratory Data Analysis with R
Exploratory Data Analysis with R Roger D. Peng This book is for sale at http://leanpub.com/exdata This version was published on 2015-11-12 This is a Leanpub book. Leanpub empowers authors and publishers
Analyzing and interpreting data Evaluation resources from Wilder Research
Wilder Research Analyzing and interpreting data Evaluation resources from Wilder Research Once data are collected, the next step is to analyze the data. A plan for analyzing your data should be developed
Cluster Analysis: Basic Concepts and Algorithms
8 Cluster Analysis: Basic Concepts and Algorithms Cluster analysis divides data into groups (clusters) that are meaningful, useful, or both. If meaningful groups are the goal, then the clusters should
Evaluation: Designs and Approaches
Evaluation: Designs and Approaches Publication Year: 2004 The choice of a design for an outcome evaluation is often influenced by the need to compromise between cost and certainty. Generally, the more
IBM SPSS Direct Marketing 19
IBM SPSS Direct Marketing 19 Note: Before using this information and the product it supports, read the general information under Notices on p. 105. This document contains proprietary information of SPSS
Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
