1 Targeting Key Influentials for Direct Marketing Activities in Social Networks: Methodical Progress and an Application Martin Klaus, Jörg Schwerdtfeger and Ralf Wagner SVI Endowed Chair for International Direct Marketing DMCC - Dialog Marketing Competence Center University of Kassel, Germany Abstract Social networks and virtual communities provide credible information about customers opinions and facilitate innovative customer dialogues. We build upon Kosinets netnography to extract consumer insights from such services. To cope with the overwhelming amount of data recorded by these services, we combine netnography with innovative social network analysis (SNA). We propose a new criterion to assess the number of influentials which should be involved in communication measures or monitored to assess consumers perceptions. Various previous attempts to utilize particular facets of networks or communities for direct marketing have lacked a suitable management framework to guide practitioners. Addressing this gap, we introduce a management cycle of gathering insights and implementing activities. Keywords: Online community, social network analysis, word-of-mouth communication
2 Targeting Key Influentials for Direct Marketing Activities in Social Networks: Methodical Progress and an Application Introduction Online communities like the blogosphere comprise both user generated content (UGC) and an interaction network structure. In the blogosphere, weblogs are the knots, and their links to other weblogs, comments, posts, blogrolls and trackbacks are the edges (Leskovec, 2007). For this paper, we focus on the blogosphere because it is open and freely accessible to everyone without registration or login (Efimova, 2003). Direct marketers aim to utilize virtual networks and communities for communication with their prospective customers or even for selling their products and services directly rather than limiting their activities to monitoring customer insights. Net citizens frequently insist on individual interactions and ignore digital mass communication measures. As a consequence, Web 2.0 communication is expensive, although it may well become highly effective through viral effects. An impressive number of 126 million weblog users (BlogPulse, 2010) poses a major challenge for direct marketers as no structure method exists to cope with analysis and communication tasks related to such numbers. Although progress has been made to categorize the blogs with respect to their influence in the sub-communities (Klaus and Wagner, 2009, Klaus et al., 2009), practitioners are still unsure how many influentials should be monitored. This paper proposes an approach of reducing the mass of data and gaining the important consumer insights from only a few influential actors of online social networks. By seizing the challenge of identifying the relevant knots in a communication network, the methodology covers two distinct direct marketing tasks: first, the identification of information and opinion aggregating individuals, and second the identification of the opining that might trigger viral marketing effects. This paper aims to: Propose a criterion to assess the number of the most relevant knots that should be considered. Demonstrate the extraction of specific consumer insights concerning a chosen topic from selected central weblogs. Propose a management cycle connecting the stages of insight gathering and communicating to a recursive work flow. This will guide practitioners in selecting and analyzing social networks as well as in monitoring the impact of their direct marketing measures. We demonstrate the practicability of this approach using the example of a weblog community discussing mobile communication and cellular phones. In the following section, we give a brief overview of related work. The methodology is outlined in Section 3. Section 4 provides a brief data description, and Section 5 covers both the results and the discussion. Finally, conclusions are drawn in Section 6. Related Work The structure of the Internet and social networks have been studied for many years (e.g., Albert et al., 1999; Barabási, 2007, Milgram, 1967), but the upsurge of communities in the WWW within the past ten years has increasingly attracted the attention of marketing scholars and practitioners (Zhang et al., 2007). The usability of already established social networks for marketing and market research, or as a supplemental instrument for sales forces, has already been investigated. Some of the most recent publications meet the claim of Subramani and
3 Rajagopalan (2003) to overcome the limitations of descriptive accounts of particular initiatives and advice based on anecdotal evidence when considering digital word-of-mouth phenomena. Recent research focuses on quantitative measures related to the results of the activities. These include the adoption of new ideas, products or opinions (Cheung et al., 2008; Ma et al., 2008; De Bruyn and Lilien, 2008), the impact on purchase probabilities (East et al., 2008) and reputation-related issues (Helm, 2000; Reichheld, 2003). Weblogs are of particular interest to marketers because of the high variety of topics discussed within the blogosphere (Glance et al., 2004). The thematic context reconnaissance and the analysis of weblog entries are researched mainly by Nanno et al. (2004), Qamra et al. (2006), Anjewierden et al. (2005), Avesani et al. (2005), Berendt and Navigli (2006), Gill (2004) and Adamic and Glance (2005). Structural characteristics of the blogosphere are researched, among others, by Chin and Chignell (2005), Tseng et al. (2005), Merelo-Guervos et al. (2004) and Leskovec et al. (2007). These studies, however, are restricted either to: considering topic detection and their relation digital communication behavior or position in the communication network. To have a comprehensive understanding, both aspects need to be integrated in the analysis framework (Klaus and Wagner, 2009, Klaus et al., 2009). Methodology The approach to quantitatively reduce the data mass of social network information and afterwards qualitatively gain the aggregated consumer insights of the community uses the characteristics of the Internet and the blogosphere. First, the structure is scale free, which allows us even to analyze smaller networks like thematically differentiated communities with only users (Barbarási, 2007; Ravid and Rafeli, 2004). Second, Broder et al., 2000 have shown in their research that the linking structure of web pages draws topic clusters which they call thematically unified clusters (TUCs). Theses clusters have a highly connected core in which the information of the community is accumulated (Dill et al., 2001). Thus a number of linked weblogs related to a topic draw clusters in which those with very high centrality, if there are any, accumulate the information to the topic. Thus, it is not necessary to read all the weblogs of the network, but only the central ones to access nearly all the information. Third, the distribution of links in the blogosphere follows the long tail phenomenon (Barabási, 2007). The long tail distribution emerges if a majority of individuals have a relatively small number of links to other users. A few users (called hubs) have a high number of link relations to other individuals (Pennock et al., 2002). This means that there are weblogs which have higher centrality and aggregate the information of the TUCs. This approach hints at the need for identifying the relevant individuals. We identify the hubs, weblogs with a promising position, by one of the following three centrality measures: degree, closeness and betweenness, which have been proposed for the identification of important individuals (Everett et al., 1999; Klaus and Wagner, 2009). The degree centrality is deemed to be the dimension of possible communication activity within the network. Thus, we assess how applicative profiles are in order to start canvassing on these profiles with a high degree of centrality. The closeness centrality is deemed to be the dimension of independence from other profiles because the closer the centrality of a profile is, the more direct connections are linked to it. Considering the distance from one profile to all the other profiles in the graph, the closeness centrality indicates how fast a marketing communication measure could spread through the network, starting from one profile.
4 The betweenness centrality assesses the opportunities for controlling the communication process. In this way, communication from these profiles can be monitored and assessed by marketers with a view to influencing them as they wish. Once the most central hubs for the three centrality measures are identified, we extract subnetworks from the original community network, which include the hubs, all weblogs they are directly connected with, called alteri, and the connections between the alteri. These subnetworks lead to a relation between the number of chosen hubs and the number of alteri from the whole community network the hubs reach directly. In addition, we use netnography (Kozinets, 2002) to analyze and interpret the content data from the identified hubs to gain key insights following the four steps of the methodology: 1. Making a cultural entrée 2. Gathering and analyzing data 3. Conducting ethical research 4. Providing opportunities for culture member feedback Finally, this paper s approach is summarized in a management cycle to give an overview of how to use the methodology in research and practice. Application Domain and Data Description We chose the often discussed topic of mobile communication and phones in the Englishspeaking blogosphere to demonstrate exemplarily this paper s approach. Blog URLs related to this topic were collected with weblog search engines like blogsearch.google.com and technorati.com, using key terms like cell phone, mobile phone and mobile communication. Not only general topic-related blogs were searched, but also blogs related to mobile phone brands and products. In total, 499 blog URLs were collected. The data were crawled with the SocSciBot crawler by Thelwall (2004, p. 245ff.). After small subnetworks, which could be identified as blog spam farms, and isolates were excluded from the network data, the final blog community network concerning the topic of mobile communication and phones included 415 English-speaking weblogs with 2,838 edges between them. This weblog network has a density d(whole net) = 0.016% and it is the database for all the following analyses. Results and Discussion The structural analysis of the community with 499 weblogs, dealing with the topic of mobile communication and phones showed first that the community exhibited the characteristics of the long tail phenomenon and was scale free. Thus, it was reasonable to detect hubs in the network with the assumption that these hubs accumulated the information of the network. The following table gives an overview of the results of the analysis using the three centrality measures, their subnetworks and the relation between the hubs and the alteri which can be reached by the hubs. Whole community Degree- Betweenness- Closeness- Network Subnetwork Subnetwork Subnetwork Knots Number of chosen central knots Reached percentage of 57.63% 55.55% 55.55% the main network Edges 2,838 2,421 2,351 2,365 Density Table 1: Overview of the SNA results for the weblog network and the built hub subnetworks
5 The number of hubs from which to extract a subnetwork was determined by the gap criteria (Tibshirani et al., 2001) method which chooses the point where the relation between the number of hubs and the number of new alteri becomes the first time one. This means we chose the number of hubs at the point where one more hub to build the subnetwork would not add even one more alteri only the this hub is connected to in the new subnetwork. For example concerning the degree centrality for active marketing in the network with the gap criteria, the 12 weblogs with the highest degree centrality need to be selected to extract the subnetwork from them, their alteri and the relations among them. Theses 12 hubs reach 266 weblogs by one direct link and their subnetwork encompasses 57.63% of the whole community network. The density of 0.034, which is twice as high as the density of the whole community network, shows that a more strongly connected core is identified, which implies that the information is accumulated on the hubs. Similar results were produced in the analysis for the betweenness and closeness centrality, as shown in Table 1. Next, we used netnography to analyze exemplarily the content of the 12 degree central hubs for the past three months from the day of the structural analysis backwards to gain consumer insights. In total, 71 consumer insights about products and services from 25 different companies were found for five categories: hardware, software, hard- & software, accessories, studies and discussion topics. All insights contained at least one positive or negative valued proposition or gave new ideas, approaches or critiques. It should be pointed out that 61.97% of the consumer insights (CI) deal with one of the four leading brands: Nokia (15 CIs), Apple (14 CIs), HTC (8 CIs) and Samsung (7 CIs). For each of the three centrality measures, there are only a certain number of possibilities for marketing actions in the social network of the blogosphere: active interaction with customers: degree centrality react on posts and comments integrate a corporate weblog in the community monitoring: betweenness centrality subscribe for RSS Feeds read weblogs classic online advertisment: closeness centrality banners and widgets on weblogs ads in comments become listed in blogrolls of hubs direct contact with hub weblogs offer samples and incentives to hub bloggers Finally, the approach of this paper can be summarized in a management cycle, shown in Figure 1. The cycle is composed of seven stages which must be processed consecutively. First, a type of social network like the blogosphere should be chosen as a database. Also, at the same stage, a topic should be defined. At first, all the data, and afterwards only the new, added data need to be crawled. Continuing into the third stage, one or more social network measures will be considered where each measure is related with a different type of marketing activity, as explained above. Using each measure, the most influencing knots of the network can be determined. This helps to reduce the data considerably for the next step, where netnography is used. This methodology requires reading all the content from the influential actors to gain consumer insights. Depending on the measures, which were chosen beforehand, the next stage challenges the execution of a related marketing activity.
6 Figure 1: Management cycle of the analysis of online social networks All the activities have to be controlled. Once they run well, the cycle continues by just updating the new, added content periodically and then checking if this leads to any changes in the measures or brings out new insights. If the data do not bring out enough or only bad results, the social network and/or the topic can be expanded or changed. Following this management cycle might help businesses to generate structured procedures to use consumer-generated content combined with the structure of online networks to start very promising direct marketing activities. Conclusion In this paper, we aim to extend the methodology for revealing the systematic patterns of influencers and recipients in online social networks. We analyzed the characteristics of the Internet and the blogosphere to show that they are well suited for digital word-of-mouth communication and viral marketing activities. Moreover, we outlined three measures for the degree of centrality of an individual and the related interpretation. In the case of our weblog network, we found that involving about 8-12 hubs ( %) would be sufficient to bring up to 57.63% of the whole weblog community into contact with the marketing communication. The hubs were examined to gain consumer insights about brands, products and topics from which marketing action alternatives were finally derived. The selection of important weblogs out of the whole network meant an expedient reduction of burdens in the subsequential netnographical analysis, whereby marketers can reach more customers with less effort in marketing activities. The stepwise procedure is assembled in the management cycle. This emphasizes the benefit of complementing the qualitative analysis with a quantitative social network analysis. Continuing working in this research area, we need to evaluate two mode networks with additional information describing the actors in order to provide researchers with even more informative clustering results.
7 References Adamic, L.A., Adar, E., How to search a social network. Social Networks 27(3), Albert, R., Jeon, H., Barabási, A.L., Diameter of the World-Wide Web. Nature 401, 9. September Anjewierden, A., Brussee, R., Efimova, L., Shared conceptualizations in weblogs. Vienna. Avesani, P., Cova, M., Hayes, C., Massa, P., Learning contextualised weblog topics. Barabási, A.L., The architecture of complexity: from network structure to human dynamics. IEEE Control System Magazine, 27(4), Berendt, B., Navigli, R., Finding your way through blogspace: Using semantics for cross-domain blog analysis. In CAAW: Proc. AAAI 2006 Symposium on Computational Approaches to Analysing Weblogs. Stanford, CA: March 2006 (1-8). Technical Report SS Menlo Park, CA: AAAI Press. BlogPulse, Broder, A, Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A. and Wiener, J., Graph structure in the web. In: Proc. 9 th World Wide Web Conf., S , Amsterdam, Elsevier Science, URL: 0the%20web.pdf BWT, Bundesministeriums für Wirtschaft und Technologie: 11. Faktenbericht 2008, URL:http://www.bmwi.de/BMWi/Navigation/Service/publikationen,did= html?view= renderprint. Cheung, C.M.K., Lee, M.K.O., Rabjohn, N., The impact of electronic word-of-mouth: The adoption of online opinions in online customer communities. Internet Research, 18(3), Chin, A., Chignell, M., Finding evidence of community from blogging co-citations: A social network analytic approach. In Proceedings of the 17th International ACM Conference on Hypertext and Hypermedia: Tools for Supporting Social Structures, Odense, Denmark, De Bruyn, A., Lilien, G. L., How Feedback Influences Evaluations of Model-Based Decision Support Systems. Information Systems Research. Dill, S., Kumar, R., McCurley, K., Rajagopalan, S., Sivakumar, D., Tomkins, A., Selfsimilarity in the web. Proc. of the 27 th VLDB Conference, Roma, Italy. East, R., Hammond, K., Lomax, W., Measuring the Impact of Positive and Negative Word of Mouth on Brand Purchase Probability, International Journal of Research in Marketing, 25 (3),
8 Efimova, L., Blogs: the stickiness factor. In Blogtalk 2003: A European conference on weblogs. Vienna, Everett, M., Borgatti, S.P., The centrality of groups and classes. Journal of Mathematical Sociology, 23(3), Everett, M., Borgatti, S.P., Ego network betweenness. Social Networks, 27(1), Gill, K. E., How can we measure the influence of the blogosphere? New York. Glance, N. S., Hurst, M., and Tomokiyo, T., Blogpulse: Automated trend discovery for weblogs. In WWW 2004 Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, 2004, at WWW 04: the 13th international conference on World Wide Web Helm, S., Viral Marketing Establishing Customer Relationships by Word-of-mouse. Electronic Markets, 10(3), Klaus, M., Schwertfeger, J., Wagner, R., Assessing Digital Social Networks for Marketing Communication The Case of MySpace.Com, in: Helfer, J.-P.; J.-L. Nicolas (Eds.): Marketing and the Core Disciplines: Rediscovering References? Proceedings of the 38th EMAC Conference, Nantes. Klaus, M., Wagner, R., Identifying Influential Communicator to gain Consumer Insights on Weblog Networks, in: Mavondo, F.; Ewing, M. (Eds.): Proceedings of ANZMAC Conference Kozinets, R., The Field Behind the Screen: Using Netnography for Marketing Research in Online Communities. Journal of Marketing Research, 39(1), Leskovec, J., M. McGlohon, C. Faloutsos, N. Glance, and Hurst, M., 2007.Cascading Behavior in Large Blog Graphs by SIAM International Conference on Data Mining (SDM). URL.: Ma, H., Yang, H., Lyu M.R., and King, I., Mining social networks using heat diffusion processes for marketing candidates selection, CIKM 08: Proceeding of the 17th ACM Conference on Information and Knowledge Mining, New York: ACM, Merelo-Guervos, J., Prieto, B., Prieto, A., Romero, G., Castillo-Valdivieso, P., and Tricas, F., Clustering web-based communities using self-organizing maps. In IADIS conference on Web-Based Communities Milgram, S., The small-world problem. Psychology Today, Nanno, T., Fujiki, T., Suzuki, Y., Okumura, M., Automatically collecting, monitoring, and mining Japanese weblogs. ACM Press. Pennok, D., Flake, G., Lawrence, S., Glover, E., and Giles, C., Winners don`t take all: Characterizing the competition for links on the web. Proceedings of the National Academic of Science, 99(8),
9 Quamra, A., B. Tseng, E., Chang, Y., Mining blog stories using community-based and temporal clustering. CIKM 06: Proceeding of the 17th ACM Conference on Information and Knowledge Mining. Arlington, ACM. URL: qamra.pdf Ravid, G., Rafeli, S., Asynchronos discussion groups as small world and scale free networks. First Monda, Peer reviewed journal on the Internet, URL: Reichheld, F.F., The number you need to grow, Harvard Business Review, 81(12), Shifry, D., State of the Blogospere URL: of-the-blogosphere Subramani, M.R., Rajagopalan, B., Knowledge-sharing and influence in online social networks via viral marketing. Communications of the ACM, 46(12): , Tibshirani, R., Walther, G., Hastie, T., Estimating the number of clusters in a dataset via the gap statistic. Journal of the Royal Statistical Society B, 63 (2), Thelwall, M., Link analysis: An information science approach. San Diego: Academic Press. Tseng, B.L., Tatemura, J. and Wu, Y., Tomographic clustering to visualize blog communities as mountain views. In WWW Zhang, J., Ackerman, M., and Adamic, L Expertise networks in online communities: structure and algorithms. Proceedings of the International World Wide Web Conference, New York, ACM,