
PRODUCTIVITY EFFECTS OF INFORMATION DIFFUSION IN EMAIL NETWORKS

Sinan Aral
NYU Stern School of Business
44 West 4th St., 8-81, NY, NY, 10012.

Erik Brynjolfsson
MIT Sloan School of Management
E53-313, Cambridge, MA, 02142.

Marshall Van Alstyne
Boston University & MIT
595 Commonwealth Avenue, Boston, MA, 02215.

Abstract

We examine the diffusion of different types of information through email networks and the effects of these diffusion patterns on the performance of information workers. In particular, we ask: What predicts the likelihood of individuals becoming aware of a new piece of information, and how quickly they obtain it? Do different types of information exhibit different diffusion patterns, and do different characteristics of social structure, relationships and individuals in turn affect access to different kinds of information? Does better access to information in turn predict an individual's ability to complete projects or generate revenue? We characterize the social network of a medium sized executive recruiting firm using accounting data on project co-work relationships and ten months of email traffic. We identify two distinct types of information diffusing over this network (event news and discussion topics) by their usage characteristics, and observe several thousand diffusion processes of each type of information. We find the diffusion of news, which is characterized by a spike in communication and rapid, pervasive diffusion through the organization, is influenced by demographic and network factors but not by functional relationships (e.g. prior co-work, authority) or the strength of ties. In contrast, the diffusion of discussion topics, which exhibit shallow diffusion characterized by back-and-forth conversation, is heavily influenced by functional relationships and the strength of ties, as well as demographic and network factors. Discussion topics are more likely to diffuse vertically up and down the organizational hierarchy, across relationships with a prior working history, and across stronger ties, while news is more likely to diffuse laterally as well as vertically, and without regard to the strength or function of relationships. We also find access to information strongly predicts project completion and revenue generation. The effects are economically significant: Each additional new word seen by an individual is correlated with about $70 of additional revenue generated by that individual. Our findings provide some of the first evidence of the economic significance and nature of information diffusion in email networks.

Keywords: Email, Social Networks, Information Diffusion, Productivity

We are grateful to the National Science Foundation (Career Award IIS-9876233 and grant IIS-0085725), Cisco Systems, France Telecom and the MIT Center for Digital Business for generous funding. We thank Tim Choe and Jun Zhang for their tireless research assistance, and Ron Burt for valuable comments.

Introduction

The process of information diffusion through social groups lies at the heart of numerous business phenomena in industrial organization, strategy, productivity, finance, marketing, and innovation. Theories on subjects as wide ranging as the diffusion of innovations (e.g. Rogers 1995), dynamic trading behavior (e.g. Hirshleifer et al. 1994), and the mechanics of word of mouth marketing (e.g. Dellarocas 2003) rely on information diffusion as a central theoretical building block, making important assumptions about how information spreads between individuals. Timely access to strategic information, innovative ideas, or current news can also highlight hidden opportunities, provide negotiating leverage (Burt 1992), promote innovation (Burt 2004), and ultimately drive economic performance (Reagans & Zuckerman 2001, Hansen 2002, Aral, Brynjolfsson & Van Alstyne 2006).

But, while theories based on information diffusion proliferate, empirical evidence on how information spreads through social groups and on the ultimate economic effects of information diffusion remains scarce. Combining economic methods with computer science data capture is an approach uniquely well positioned to address this gap. Diffusion studies are typically theoretical (Jackson & Yariv 2005) or simulation based, or observe adoption or purchase decisions rather than the actual flow of information. Existing theory focuses mainly on which global social structures maximize diffusion, and although we know that transfers of certain types of information are easier than others (Von Hippel 1998), diffusion studies typically treat information as homogeneous, making variation in diffusion patterns across different information types difficult to theorize.

These gaps in research give rise to a natural set of questions about the movement of information through populations: How does information diffuse through a given social group? What makes someone more likely to be exposed to an idea as it spreads? Do different types of information diffuse differently? Can we explicitly link access to novel information to changes in performance?

We study the movement of different types of information through one organization over two years to understand how it diffuses and how diffusion patterns affect the relative productivity of information workers. We argue that content and structure jointly predict the diffusion path of a given piece of information: both the type of information and the types of social relationships or structures through which it passes affect the diffusion path. While one type of information may be more likely to diffuse upward through the organizational hierarchy or strictly across functional relationships, another may diffuse laterally or without regard to function or hierarchy.

To test our theory, we characterize the social network of a medium sized executive recruiting firm using ten months of email data and accounting data on project co-work relationships. We identify two types of information diffusing through this network (event news and discussion topics) by their usage characteristics, and observe several thousand diffusion processes of each type. We then test the effects of network, functional, organizational and demographic characteristics of dyadic relationships and individuals on the likelihood of receiving each type of information and receiving it sooner, and the effects of access to information on performance.

Our results demonstrate that the diffusion of news, characterized by a spike in communication and rapid, pervasive diffusion through the organization, is influenced by demographic and network factors but not by functional relationships (e.g. prior co-work, authority) or the strength of ties. In contrast, diffusion of discussion topics, which exhibit more shallow diffusion characterized by back-and-forth conversation, is heavily influenced by functional relationships and the strength of ties, as well as demographic and network factors. Discussion topics are more likely to diffuse vertically up and down the organizational hierarchy, across relationships with a prior working history, and across stronger ties, while news is more likely to diffuse laterally as well as vertically, and without regard to the strength or function of relationships.

We also find that access to information strongly predicts employees' productivity. Timely access to more information predicts the number of projects completed by each individual and the amount of revenue each person generates, holding constant demographic and traditional human capital characteristics. These effects are economically significant, with each additional word seen by an individual correlated with about $70 of additional revenue generated. Conversely, productivity suffers noticeably the longer it takes an employee to receive information. Our findings provide some of the first evidence of the economic significance of information diffusion in email networks.

Theory & Literature

Information Diffusion: Its Importance & the Current State of Knowledge

Theories of innovation diffusion (e.g. Rogers 1995) ultimately rely on information diffusion as a central mechanism driving adoption decisions. Potential adopters are exposed to new innovations and are convinced to adopt through "processes by which participants create and share information with one another in order to reach mutual understanding" (Rogers 1995: 17). As Rogers (1995: 17-18) describes, "the essence of the diffusion process is the information exchange through which an individual communicates a new idea to one or several others." Information diffusion also underlies several theories of word of mouth marketing and dynamic trading behavior in financial markets. Hirshleifer et al. (1994) demonstrate that temporal asymmetries in the diffusion of information to traders create abnormal profits for the informed and explain seemingly irrational trading equilibria, such as herding or outcomes based on "follow the leader" strategies. Yet, in these models temporal asymmetries in information acquisition are taken as given, and how and why these systematic asymmetries arise remains unknown.

Current information diffusion studies typically rely on computer simulations of a handful of agents (e.g. Buskens & Yamaguchi 1999, Newman et al. 2002, Reagans & Zuckerman 2006) and treat information as homogeneous (e.g. Buskens & Yamaguchi 1999, Wu et al. 2004, Newman et al. 2002, Reagans & Zuckerman 2006). Much of current literature concerns maximizing the spread of influence through a social network by identifying influential nodes likely to trigger pervasive information cascades (e.g. Domingos & Richardson 2001, Newman et al. 2002, Kempe, Kleinberg, Tardos 2003), or enumerating characteristics of information cascades (e.g. Leskovec, Singh, Kleinberg 2006). [1] A current focus on global network properties that maximize information diffusion (e.g. Watts & Strogatz 1998) deemphasizes predictors of access to information cascades and their economic consequences. [2]

In addition, information homogeneity assumptions are problematic in light of evidence demonstrating differences in transfer effectiveness across different types of information (Reagans & McEvily 2003). Some information is simply "stickier" (Von Hippel 1998) and more difficult to transfer (Hansen 1999) due to its specificity (Nelson 1990), complexity (Uzzi 1997, Hansen 1999), the amount of related knowledge of the receiver (Cohen & Levinthal 1990, Hansen 2002), or the degree to which information is declarative or procedural (Cohen & Bacdayan 1994). These factors make it unlikely that all types of information exhibit uniform transfer rates or diffusion patterns across different social structures.
[1] Leskovec, Singh & Kleinberg (2006: 1) find that cascades in online recommendation networks tend to be shallow, but occasionally large bursts of propagation appear such that the distribution of cascade sizes is approximately heavy-tailed.

[2] Two core models have emerged to explain the diffusion of influence and contagion. Threshold models posit that individuals adopt innovations after surpassing their own private "threshold" (e.g. Granovetter 1978, Schelling 1978). Cascade models posit that each time an adjacent individual adopts, the focal actor adopts with some probability that is a function of their relationship (e.g. Kempe, Kleinberg, Tardos 2003). While both models assume an information transmission between adopters and nonadopters, they rarely specify the nature of the information or the conditions under which exchanges take place. Rather, the diffusion process is typically tested under various assumptions about the distribution of thresholds or dyadic adoption probabilities in the population. In fact, as Kempe, Kleinberg, Tardos (2003: 2) explain, "the fact that [thresholds] are randomly selected is intended to model our lack of knowledge of their values."

Although there is a body of literature on knowledge transfers and performance (e.g. Reagans & Zuckerman 2001), most of this work remains "agnostic with respect to content" (Hansen 1999: 83) and only considers whether knowledge is flowing rather than the type of knowledge being transferred. A related literature examines conditions under which knowledge and information flow efficiently between business units and individuals (e.g. Hansen 1999, 2002, Reagans & McEvily 2003), but this work focuses on dyadic transfers of information rather than on the diffusion paths of information through a population.

In the next section, we develop theory suggesting that the strength and function of social relationships, geographic proximity, organizational boundaries, and hierarchy, authority and status differences across social groups affect the movement of information, and have different effects across different types of information. In so doing, we propose three extensions to current work. First, in addition to network structures, we argue that there are hierarchical, demographic and task based drivers of information diffusion. For example, information may diffuse more readily vertically (or laterally) through an organizational hierarchy due to authority or status differences, or more quickly through functional relationships than strong ties per se. Second, we hypothesize that different types of information content diffuse differently. Third, we argue that content and structure jointly predict the diffusion path of information: different social and structural factors will govern the diffusion of different types of information. We then argue that timely access to novel information should improve decisions and productivity. Employees who are aware of new ideas and information are better able to solve problems, improve decisions and conclude projects.

Social Drivers of Information Diffusion

We hypothesize five categories of factors that may impact information dynamics in organizations:

1. Demography & Demographic Distance. Individuals' demographic characteristics and dissimilarity are likely to affect social choices about information seeking and information transmission. Similar individuals tend to flock together in social relationships (McPherson, Smith-Lovin, & Cook 2001), creating parity in perspectives, information and resources across demographically similar individuals in organizations (Burt 1992, Reagans & Zuckerman 2001). Demographic diversity can also create social divisions and tension (Pfeffer 1983), reducing the likelihood that individuals will go to each other for advice or pass information (Blau 1977). We therefore measure the demographic characteristics of individuals and the demographic dissimilarity of pairs of individuals, focusing on age, gender, and education, three of the most important variables in organizational demography. [3]

2. Organizational Hierarchy. Formal structures define reporting relationships and work dependencies that necessitate communication and coordination (Mintzberg 1979). Managers and employees frequently communicate to manage administrative tasks even when they are not working on the same projects, and the importance of notification for accountability, and of recognition for upward mobility, encourages dialogue and information exchange along hierarchical lines. Embedded within formal organizational hierarchies are gradients of status and authority that may also guide information flows (Blau 1977). As project teams in our organization are organized hierarchically, task related information is likely to flow vertically rather than laterally across organizational rank. We therefore measure each individual's position in the organizational hierarchy (e.g. partner, consultant, and researcher).

3. Functional Task Characteristics. Working relationships are conduits of information flow. They necessitate exchanges of task related information and create relatively stable ties that individuals rely on for future projects. However, relationships can decay over time (Burt 2002), and repeated relationships are more likely to create long term conduits through which information diffuses. People can also seek advice laterally from peers as distinct from seeking direction vertically from superiors. We therefore measure the strength of project co-work relationships by the number of projects employees have worked on together. We also know from the literature on absorptive capacity (Cohen & Levinthal 1990) that related knowledge helps individuals consume new information, and individuals in related fields and of related expertise are more likely to swim in the same pools of information. We therefore also measure whether or not employees work in the same expertise area.
We expect information to diffuse more easily between employees with the same industry tenure, who have been through similar work related milestones and may already be familiar with one another through industry relationships (Pfeffer 1983). Status and authority differences also may prevent less experienced workers from soliciting or sharing information across industry tenure gradients, while more experienced workers, less constrained by status and authority differences, may rely on other experienced workers for information.

4. Tie & Network Characteristics. Informal networks are also likely to impact information diffusion in organizations. A vast literature treats the relationship between social network structure and performance (e.g. Burt 1992, Cummings & Cross 2003). Although most of this work does not measure information flows explicitly, evidence of a relationship between performance and network structure is typically assumed to be due in part to the information flowing between connected actors (Burt 1992, Reagans & Zuckerman 2001). As individuals interact more frequently, they are likely to pass information to one another. We therefore measure the strength of communication ties by the total volume of email passing between each pair of individuals in our network. Other studies demonstrate that betweenness centrality $B(n_i)$ (Freeman 1979), [4] which measures the probability that the individual will fall on the shortest path between any two other individuals linked by email communication, predicts the total amount of knowledge acquired from other parts of the network (Hansen 1999), and that actors with high network constraint $C_i$ (Burt 1992: 55) [5] (a proxy for the redundancy of contacts) are less privy to new information (Burt 1992). We therefore measure individuals' betweenness centrality and their constraint as follows:

\[ B(n_i) = \sum_{j<k} \frac{g_{jk}(n_i)}{g_{jk}}; \qquad C_i = \sum_{j} \Big( p_{ij} + \sum_{q} p_{iq}\, p_{qj} \Big)^{2}, \quad q \neq i, j. \]

5. Geographic Distance. Finally, a great deal of evidence links physical proximity to communication between actors (e.g. Allen 1977). In the case of email, geographic distance may be associated with more email communication between actors who find it costly to communicate face to face. We therefore measure physical proximity by whether two people work in the same office.

[3] We do not have access to race or organizational tenure variables (although we do measure industry tenure).

[4] Where $g_{jk}$ is the number of geodesic paths linking j and k and $g_{jk}(n_i)$ is the number of geodesic paths linking j and k involving i.

[5] Where $p_{ij} + \sum_{q} p_{iq}\, p_{qj}$ measures the proportion of i's contacts directly or indirectly involving j; $C_i$ sums across all of i's contacts.

Dimensions of Information Content

Characteristics of information content are also likely to affect diffusion patterns. Certain types of information are "stickier" and have higher transfer costs (Von Hippel 1998). We describe two contrasting information types, event news and discussion topics, which serve as vignettes for comparison. [6]

[6] These vignettes are intended as archetypes, not mutually exclusive categories. These archetypes evoke underlying usage characteristics of information that are likely correlated with diffusion patterns of particular words in email. Our contention is that these types of information are likely to diffuse in specific patterns, and that words exhibiting these patterns proxy for the characteristics we describe. The relationship between the types of information described and the diffusion patterns observed is not critical. Our goal is to demonstrate that different characteristics of people, relationships, and social structure affect access to information with different aggregate diffusion patterns.

Event News. We define event news as simple, declarative, factual information that is likely triggered by an external event and is of general interest to many people in the organization. In the context of our research site, employees may learn of forthcoming layoffs at a source company, a forthcoming change in company policy, or a significant change in top management through an information cascade that travels quickly and pervasively throughout the organization. Such information is likely simple, declarative and factual, informing recipients of an event that has taken or will soon take place. It is also likely to be of general interest to most employees in the firm and be widely shared among many people and across organizational and hierarchical boundaries.

Discussion Topics. We define discussion topics as more specific, complex, and procedural, characterized by back and forth discussion of interest to limited, specialized groups. At this firm, work groups discuss particular projects, and most frequently have back and forth dialog about particular candidates or clients. Candidate names may volley back and forth as individual merits for a particular job are being considered. Teams specializing in filling nursing job vacancies in the south eastern United States may circulate names among other recruiters who specialize in the same type of job in the same region.

Theories of information transfer support our distinctions between event news and discussion topics. Complex knowledge is more difficult and costly to transfer, requiring strong dyadic ties for effective transfers (Hansen 1999). A theoretical distinction is also made between declarative and procedural information (Cohen & Bacdayan 1994, Bulkley & Van Alstyne 2005), with the former consisting of facts, propositions and events, and the latter of information about how to accomplish tasks, activities or routines (Cohen & Bacdayan 1994: 557). We argue that event news is more likely to be simple and declarative, and thus more easily transferred widely among different types of people. Nelson (1990) and Von Hippel (1998) also make the distinction between specific and generic information and knowledge, arguing that, in contrast to the specific, "generic knowledge not only tends to be germane to a wide variety of uses and users. Such knowledge is the stock in trade of professionals in a field so that when new generic knowledge is created anywhere, it is relatively costless to communicate to other professionals" (Nelson 1990: 11-12, as quoted in Von Hippel 1998: 431). [7] Finally, transfers of information and knowledge are more effective among individuals with related knowledge (Cohen & Levinthal 1990, Hansen 2002). Those with similar expertise or specialization are more likely to share information due to their shared common interests and their ability to more effectively communicate ideas based on their common ground (Cramton 1991). We therefore hypothesize that diffusion of event news will be driven by demographic and network factors that constrain interactions due to homophily and network constraints.

H1: Access to event news is driven by demographic similarity, and structural characteristics of network position such as betweenness centrality, constraint and path length.

[7] While an important distinction exists between knowledge and information, we assume characteristics that make knowledge complex and costly to transfer influence characteristics of information employees in this firm send and receive.

On the other hand, information passed back and forth amongst small groups is likely to be task specific and relevant to those socially and organizationally proximate to the originator. At our research site, since work groups are organized vertically along the organizational hierarchy, with teams composed of one member from each organizational level, we expect task related information to be passed vertically up and down the organizational hierarchy, rather than laterally between members of the same organizational level. We hypothesize that diffusion of discussion topics is driven not only by demographic and network factors, but also by project co-work relationships and organizational hierarchy.

H2: Access to discussion topics is driven by demographic similarity, and structural characteristics of network position such as betweenness centrality, constraint and path length, as well as by task characteristics and organizational hierarchy.

Information Access & Productivity

Both information economics and social network theory contend that timely access to novel information can help employees make faster, higher quality decisions and improve their relative productivity and performance. Reductions in uncertainty can improve resource allocations and decision making, and reduce delay costs (Cyert & March 1963, Galbraith 1973). In our context, precise, timely information about the candidate pool can reduce time wasted interviewing candidates unsuitable for a given search. Timely information also tempers risk aversion, enabling actors to make appropriate decisions faster (Arrow 1962). Reductions in uncertainty help recruiters place the right candidates in front of the right clients at the right time, increasing the likelihood of concluding searches faster and improving contract execution per unit time.

Information is also valuable due to its local scarcity. Actors with scarce, novel information in a given network neighborhood are better positioned to broker opportunities, barter for future favors, or apply information to problems that are intractable given local knowledge (e.g. Burt 1992). For example, the name of an eligible candidate may enter the communication flows of the firm at a certain point in time and then diffuse through the organization. Recruiters who are aware of these novel pieces of information, such as the names of potential candidates or news of upcoming layoffs at a source firm, can improve the efficiency with which they match candidates to positions. Receiving more novel information sooner should therefore improve the relative timeliness and quality of these matches and increase project completion rates and revenue generation.

H3: Project completion and revenue generation by individuals are correlated with the amount and timeliness of novel information observed by those same individuals.

Methods

Data

Data for this study come from three sources: (i) accounting data on project co-work relationships, organizational positions, physical locations, projects completed and revenues generated; (ii) email data captured from the firm's corporate email server; and (iii) surveys of demographic characteristics, education, and industry tenure. Email data cover 10 months of complete email history over two equal periods from October 1, 2002 to March 1, 2003 and from October 1, 2003 to March 1, 2004.
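As a rough illustration of how the email-based measures named under Social Drivers of Information Diffusion (tie strength, betweenness centrality and network constraint) could be computed from a message log, the following Python sketch may help. It is not the procedure used in this study. It assumes a hypothetical log of (sender, recipient) pairs, treats ties as undirected, counts any exchanged email as a link when finding geodesics, and derives the proportions p_ij from email volume; the function names and the four-person example are invented for illustration.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations


def tie_strength(messages):
    """Email volume per unordered pair of people (direction ignored)."""
    volume = Counter()
    for sender, recipient in messages:
        if sender != recipient:
            volume[frozenset((sender, recipient))] += 1
    return volume


def adjacency(volume):
    """Undirected contact graph: an edge wherever any email was exchanged."""
    adj = defaultdict(set)
    for pair in volume:
        a, b = tuple(pair)
        adj[a].add(b)
        adj[b].add(a)
    return adj


def geodesics_from(adj, source):
    """Breadth-first search: distance to, and number of shortest paths
    reaching, every node connected to `source`."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                count[w] += count[v]
    return dist, count


def betweenness(adj):
    """B(n_i) = sum over pairs j < k of g_jk(n_i) / g_jk (unnormalised)."""
    paths = {v: geodesics_from(adj, v) for v in adj}
    score = dict.fromkeys(adj, 0.0)
    for j, k in combinations(sorted(adj), 2):
        dist_j, count_j = paths[j]
        dist_k, count_k = paths[k]
        if k not in dist_j:
            continue  # j and k are not connected by any path
        for i in dist_j:
            if i in (j, k):
                continue
            # i lies on a j-k geodesic when the two legs add up exactly
            if dist_j[i] + dist_k[i] == dist_j[k]:
                score[i] += count_j[i] * count_k[i] / count_j[k]
    return score


def constraint(volume, i):
    """C_i = sum_j (p_ij + sum_q p_iq * p_qj)^2, q != i, j, over i's contacts j.
    Assumption: p_ab is the share of a's total email volume exchanged with b."""
    people = {person for pair in volume for person in pair}

    def z(a, b):
        return volume.get(frozenset((a, b)), 0)

    def p(a, b):
        total = sum(z(a, c) for c in people if c != a)
        return z(a, b) / total if total else 0.0

    total_constraint = 0.0
    for j in people - {i}:
        if z(i, j) == 0:
            continue  # j is not one of i's contacts
        indirect = sum(p(i, q) * p(q, j) for q in people - {i, j})
        total_constraint += (p(i, j) + indirect) ** 2
    return total_constraint


# Hypothetical four-person log: ann exchanges mail with bob, cy and dee.
messages = [("ann", "bob"), ("bob", "ann"), ("ann", "cy"), ("ann", "dee")]
volume = tie_strength(messages)
scores = betweenness(adjacency(volume))
print(scores["ann"], constraint(volume, "ann"))  # prints: 3.0 0.375
```

In this toy star network ann sits on every geodesic between the other three people, so her betweenness is 3, while her contacts do not email one another, which keeps her constraint low (0.375) relative to bob, whose only contact is ann (constraint 1.0).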
We wrote and developed capture software specific to this project and took multiple steps to maximize data integrity and levels of participation. New code was tested at Microsoft Research Labs for server load, accuracy and completeness of message capture, and security exposure. To account for differences in user deletion patterns, we set administrative controls to prevent data expunging for 24 hours. The project went through nine months of human subjects review prior to launch and content was masked using cryptographic techniques to preserve individual privacy (Van Alstyne & Zhang 2003). Spam messages were excluded by eliminating external contacts who did not receive at least one message from someone inside the firm.⁸ Participants received $100 in exchange for permitting use of their data, resulting in 87% coverage of recruiters eligible to participate and more than 125,000 email messages captured. Details of data collection are described by Aral, Brynjolfsson & Van Alstyne (2006). Since cryptographic techniques were used to protect privacy, we observe unique tokens for every word in the email data and construct diffusion metrics based on the movement of words through the organization in email.

Survey questions were generated from a review of relevant literature and interviews with recruiters. Experts in survey methods at the Inter-University Consortium for Political and Social Science Research vetted the survey instrument, which was then pre-tested for comprehension and ease-of-use. Individual participants received $25 for completed surveys and participation exceeded 85%.

Identifying Heterogeneous Information Types

Our goal is to identify event news and discussion topics by their usage characteristics. We defined event news as simple, declarative, factual information that is likely triggered by an external event and of general interest to many people in the firm. Given these criteria, we assume event news is characterized by a spike in activity and a rapid pervasive diffusion to members of the organization, followed by a decline in use.
We are also interested in identifying discussion topics, which we define as more complex, specific to a group of people, containing more procedural information and, in Von Hippel's (1998) parlance, "sticky". We expect this information to exhibit more shallow diffusion, characterized by back-and-forth conversation among smaller groups for more extended periods.⁹

We began with a dataset consisting of approximately 1.5 million words whose frequencies were distributed according to the standard Zipf's Law distribution (see Figure 1). We eliminated words unlikely to represent diffusion words by culling extremely rare words (term frequency < 11), words commonly used every week (appearing in at least one email during every week of the observation period), and words without spikes in activity (low term frequency-cumulative inverse document frequency, i.e. tf-cidf, a common measure of usage spikes (Gruhl et al. 2004)).¹⁰ These three methods reduced the sample to 120,000 words.

⁸ In this study we focus on email sent to and from members of the firm due to the difficulty of estimating accurate social network structures without access to whole network data (see Marsden 1990).
⁹ We thank Tim Choe for his tireless coding efforts that extracted and manipulated the email data described in this section.

Table 1: Descriptive Statistics

Variable                                    Obs.      Mean        SD   Min      Max
Gender (Male = 1)                         832419       .50       .49     0        1
Age Difference                            562650     12.22      8.81     0       39
Gender Difference                         832419       .50       .49     0        1
Education Difference                      562650      1.38      1.26     0        6
Email Volume                              809613   1474.65   1129.95     0     4496
Strength of Tie                           832419     11.71     36.90     0      464
Path Length                               832419      2.61      2.68     0       10
Geographic Proximity (Same Office = 1)    832419       .30       .46     0        1
Friends in Common                         832419      6.70      5.75     0       35
Betweenness Centrality                    809613     36.77     36.81     0   165.73
Constraint                                809613      .213       .09     0      .51
Prior Project Co-Work                     832419       .26      1.33     0       19
Industry Tenure Difference                562650     10.08      8.32     0       38
Same Area Specialty                       832419       .10       .30     0        1
Managerial Level Difference               832419       .86       .71     0        2
Partner                                   832419       .36       .48     0        1
Consultant                                832419       .40       .48     0        1
Researcher                                832419       .22       .41     0        1

In selecting event news, we sought words with a spike in activity and a rapid, pervasive diffusion to members of the organization, followed by a decline in use. We chose words seen by more than 30 people with a coefficient of variation one standard deviation above the mean.¹¹ To select words likely to display rapid propagation, among words that reached more than 30 people we selected those with a coefficient of variation of activity one standard deviation above the mean, that is, words with bursts of activity in some weeks relative to others. The coefficient of variation has been used in previous work to identify spikes in topic frequency in blog posts (Gruhl et al. 2004) and is a good measure of dispersion across data with heterogeneous mean values (Ancona & Caldwell 1992).¹² Observations of a large number of people suddenly using a word much more frequently than usual are likely to indicate information triggered by some external event that is diffusing through the organization.¹³
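The event-news selection rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually used in the study: the function names and the toy weekly counts are hypothetical, the population standard deviation is assumed, and the mean and standard deviation of the coefficient of variation are assumed to be computed over the words that reached more than 30 people.

```python
from statistics import mean, pstdev

def coefficient_of_variation(weekly_counts):
    """Standard deviation of weekly email counts divided by their mean."""
    m = mean(weekly_counts)
    if m == 0:
        return 0.0
    return pstdev(weekly_counts) / m

def select_event_news(words, min_people=30):
    """words maps word -> (people_reached, weekly_counts).

    Keep words seen by more than `min_people` people whose coefficient of
    variation exceeds the mean by one standard deviation (both computed
    over the words that pass the reach filter; this scope is an assumption).
    """
    cv_by_word = {w: coefficient_of_variation(counts)
                  for w, (people, counts) in words.items()
                  if people > min_people}
    if not cv_by_word:
        return []
    cvs = list(cv_by_word.values())
    cutoff = mean(cvs) + pstdev(cvs)
    return sorted(w for w, cv in cv_by_word.items() if cv > cutoff)

# Hypothetical toy data: (people reached, emails per week containing the word)
toy_words = {
    "merger": (45, [0, 0, 0, 40, 2, 0, 0, 0]),   # sudden burst
    "lunch":  (50, [5, 6, 5, 4, 5, 6, 5, 4]),    # steady use
    "budget": (40, [3, 4, 2, 5, 3, 4, 3, 4]),    # steady use
    "report": (35, [2, 3, 2, 3, 2, 3, 2, 3]),    # steady use
    "resume": (10, [0, 0, 9, 0, 0, 0, 0, 0]),    # bursty but reaches too few people
}
selected = select_event_news(toy_words)
```

In this toy example only the bursty, widely seen word passes both filters; the bursty word that reached 10 people is dropped by the reach threshold before the coefficient of variation is compared.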
The result is a sample of 3275 words that are at first rarely used, then suddenly used much more frequently and by more than 30 people, followed by a decline in use.

We then selected a sample of discussion topic words where users both received and sent the word in email. This simple criterion selected approximately 4100 words from the original candidate set. Examples of the usage characteristics of event news and discussion topics are shown in Figures 3 & 4. Words in this sample display a lack of use, followed by a shallow diffusion to a limited number of people in back and forth discussion, which in the case of the word shown in Figure 4 lasts close to 3 months. These words are shared in back and forth conversation as shown in Figure 5.

After selecting these words based on their usage characteristics, we tested whether our information types exhibited significantly different usage characteristics and diffusion properties. As Leskovec, Singh & Kleinberg (2006) have noted, information cascades are typically shallow, but are sometimes characterized by large bursts of wide propagation. We wanted to make sure we captured both these phenomena in our data. We therefore summarized the usage characteristics of words along several dimensions including the number of emails containing the word, the number of people who used the word, the coefficient of variation of use, the number of emails per person that contain the word, the total diffusion time divided by the total time in use (as a proxy for use beyond the diffusion to new users), and the maximum number of people who saw the word for the first time in a given day (a proxy for the maximum spike in activity).

¹⁰ The tf-cidf constraint chooses words that record a spike in weekly usage greater than three times the previous weekly average, retaining words likely to cascade or diffuse. The cutoff of 11 produced similar results as cutoffs in the neighborhood of 11.
¹¹ The distribution of employees using common words provides a robust contextual proxy for widely used information in the firm. Using a histogram of common word distribution over the number of people using those words, we determined that most common words were used by between 30 and 70 people (see Figure 2). To be conservative, we selected any word seen by more than 30 people as a potential observation of event news.
¹² The coefficient of variation is the standard deviation of the number of emails per week that contain a word divided by the mean number of emails per week that contain that word.
¹³ Event driven spikes in use not part of diffusion processes will downward bias our estimates, making them more conservative.

Table 2: Correlation Matrix

Measure                               1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18
1. Gender (M = 1)                  1.00
2. Age Difference                   .06  1.00
3. Gender Difference                .27   .06  1.00
4. Education Difference             .08   .09   .06  1.00
5. Email Volume                    -.11  -.00  -.04  -.04  1.00
6. Strength of Tie                 -.04  -.06  -.01   .01   .30  1.00
7. Path Length                     -.11   .00   .00  -.04  -.37  -.18  1.00
8. Geographic Proximity            -.06   .09  -.01  -.03   .06   .17  -.06  1.00
9. Common Friends                   .04  -.02   .03   .02   .50   .38  -.40   .08  1.00
10. Betweenness Centrality          .06   .01   .01  -.07   .66   .22  -.31   .03   .48  1.00
11. Constraint                     -.18  -.05  -.05   .01  -.26  -.06   .54  -.05  -.34  -.33  1.00
12. Project Co-Work                 .02  -.01   .01  -.01   .01   .37  -.09   .08   .15   .02  -.06  1.00
13. Industry Tenure Difference      .11   .50   .06  -.05  -.09  -.08   .03   .07  -.07  -.08  -.12   .05  1.00
14. Same Area Specialty            -.01  -.16  -.00   .01   .12   .40  -.12   .27   .19   .05  -.04   .33  -.12  1.00
15. Managerial Level Difference     .05   .52   .05   .11   .03  -.10   .01  -.03   .01   .02  -.07   .03   .50  -.21  1.00
16. Partner                         .21   .06   .06   .02  -.06  -.05  -.14  -.06   .07  -.03  -.31   .09   .26  -.05   .23  1.00
17. Consultant                     -.12  -.07  -.03  -.04  -.31  -.09   .29  -.26  -.19  -.22   .19  -.01  -.13  -.07  -.21  -.56  1.00
18. Researcher                     -.09   .01  -.03   .03   .40   .15  -.17   .35   .13   .27   .12  -.08  -.13   .14  -.01  -.44  -.51  1.00

[Figure 1. Distribution of Word Frequencies in Email. Panel title: Zipf Law in Email Data Set; x-axis: Word Frequency; y-axis: Word Count; both axes on a log scale.]

[Figure 2. Distribution of Common Word Usage.]

[Figure 3. An Example Event News Item. x-axis: Date in Days; y-axis: # of people reached, # of emails used; series: people reached, email activity.]

[Figure 4. An Example Discussion Topic Item. x-axis: Date in Days; y-axis: # of people reached, # of emails used; series: people reached, email activity.]

We then tested whether words in each category differed significantly across these dimensions. T-tests demonstrate that they differ significantly across all dimensions of interest related to their use and diffusion (see Table 3).

[Figure 5. Discussion Paths in Discussion Topic Items.]

Table 3: Mean Usage Characteristics and Diffusion Properties of Information Types

Usage Characteristics & Diffusion Properties      News    Discussion    t-statistic
Number of Words                                   3235          4168              -
Potential Diffusion Events                      245280        320470              -
Realized Diffusion Events                        65145          9344              -
Number of Emails                                236.21         17.69       27.69***
Mean Diffusion Depth                             36.31          2.48      213.28***
Coefficient of Variation                          1.46          4.11       90.53***
Emails Per Person                                 6.10          7.47       1.105***
Diffusion Time / Total Use Time                    .97           .48       66.36***
Maximum New Users Per Day                         9.38          1.60       61.51***
Note: * p <.05, ** p <.01, *** p <.001

Data Structure

We observe the diffusion of several thousand words of each information type from the first occurrence of a given word in our data, to all employees in our sample. We observe whether a given employee received the word, the rank order in which they received the word, and the time between the first use of the word and the first instance of receipt of the word by each employee. An observation is a word-recipient pair (one for each possible recipient in the firm). We record dyadic characteristics of each first user-recipient pair, such as age and industry tenure differences, for all potential recipients and the individual characteristics of recipients (e.g. gender, network position).
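The word-recipient data structure can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the extraction code used for the study: the function name, field names, token, and employee names are hypothetical, the first user is taken to be the person with the earliest timestamp, and employees who never see the word are kept as rows with missing rank and time so that they can later be treated as right-censored.

```python
from datetime import datetime

def build_observations(word, sightings, employees):
    """Return one row per potential recipient of `word`.

    sightings: list of (employee, timestamp) for emails containing the word;
    the earliest sighting is treated as the first use (an assumption).
    """
    first_seen = {}
    for person, ts in sorted(sightings, key=lambda s: s[1]):
        first_seen.setdefault(person, ts)        # keep only the first sighting
    ordered = sorted(first_seen.items(), key=lambda kv: kv[1])
    first_user, t0 = ordered[0]
    rank = {p: i for i, (p, _) in enumerate(ordered)}  # first user has rank 0

    rows = []
    for person in employees:
        if person == first_user:
            continue                              # the originator is not a recipient
        received = person in first_seen
        rows.append({
            "word": word,
            "first_user": first_user,
            "recipient": person,
            "received": received,
            "rank": rank[person] if received else None,
            "days_to_receipt": (first_seen[person] - t0).days if received else None,
        })
    return rows

# Hypothetical example: one masked token seen by three of four employees
employees = ["ann", "bob", "cy", "dee"]
sightings = [
    ("ann", datetime(2002, 10, 1)),
    ("bob", datetime(2002, 10, 3)),
    ("cy", datetime(2002, 10, 10)),
    ("bob", datetime(2002, 10, 12)),   # repeat sighting, ignored
]
rows = build_observations("tok_17", sightings, employees)
```

Dyadic covariates such as age or tenure differences would then be attached to each row from the first user and recipient fields.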

Statistical Specifications

We estimate the impact of hypothesized factors on the likelihood of seeing information and seeing it sooner. Linear estimates of probabilistic outcomes create bias due to non-linearity at upper and lower bounds of the likelihoods of discrete events. They are not well suited to temporal processes in which outcome variables can be conditioned on previous events and they also produce biased estimates of longitudinal data with right censoring (Strang & Tuma 1993). For these reasons we specify logistic and hazard rate models of diffusion. We first estimate the influence of independent variables on the likelihood of receiving a given piece of information using a standard logistic regression model formalized in equation 1, where parameters describe the impact of a given variable on the likelihood of receiving the word during the ten months of email observation.

\ln\left[\frac{P(Y_i = 1)}{1 - P(Y_i = 1)}\right] = \alpha + \sum_j \beta_j X_{ij} + \varepsilon_i \qquad [1]

However, pooled cross sectional estimates may wash away temporal variation and allow later events to influence estimates of earlier diffusion (Strang & Tuma 1993). We therefore estimate the rate of receipt of different types of information conditional on having received the information, using a Cox proportional hazard rate model of the speed with which employees receive information:

R(t) = r(t)^{b} e^{\beta X} \qquad [2]

where $R(t)$ represents the rate of information receipt, $t$ is time in the risk set, and $r(t)^{b}$ the baseline rate. The effects of independent variables are specified in the exponential power, where $\beta$ is a vector of estimated coefficients and $X$ is a vector of independent variables. Coefficients are reported as hazard ratios, which estimate the percent increase or decrease in the rate at which information is seen associated with a one unit increase in the independent variable. Ratios greater than 1 represent an increase in the rate of information diffusing to the receiver (equal to the ratio minus 1); ratios less than 1 represent a decrease (equal to 1 minus the ratio).¹⁴

Finally, we test the performance implications of access to information diffusing through the network.
We test the relationship between access to information ($D_i$) and productivity ($P_i$), controlling for traditional demographic and human capital factors ($HC_{ji}$):

P_i = \gamma + \beta_1 D_i + \sum_j B_j HC_{ji} + \varepsilon_i \qquad [3]

where productivity ($P_i$) is measured by projects completed and revenues generated during the period of email observation, and access to information ($D_i$) is measured by the number of words that were seen by the recruiter, the mean rank order in which they received words relative to their colleagues, the mean time it took for them to receive words, the number of words for which they were in the top 10% and the top 50% of recipients by time, and the number of words they saw in the first week and the first month. Human capital and demographic measures ($HC_{ji}$) include age, gender, education, industry experience, and organizational position.

Results

Estimation of the Diffusion of Information

We first tested the diffusion of all types of information through the firm (see Table 3). Although employment at the firm is gender balanced and controlling for correlations between gender and organizational position (partner and consultant dummies), men are 55% more likely than women to receive information of all types. Demographic dissimilarity between originator and recipient reduces the likelihood of receiving information by between 1% and 13%, with gender differences recording the largest impact and age differences the smallest. The strength of ties increases the likelihood of receiving information. Ten additional emails sent increases the likelihood of receiving information by 2%. Path length reduces the likelihood of receiving information, with each additional hop reducing the likelihood of diffusion by 29%. Having friends in common seems to reduce the likelihood of receiving an information cascade. However, having friends in common is positively correlated with email volume and the strength of ties. Holding these variables constant, the initially positive effects of friends in common reduce and reverse. Betweenness centrality has a strong positive effect on the likelihood of receiving information, as do stronger project co-work relationships.

¹⁴ We found no compelling evidence of duration dependence and proceeded with traditional estimations of the Cox model.

Table 3. Drivers of Access to Information

                                           Model 1                  Model 2
Dependent Variable:                        Word Received            Rate of Receipt
Specification (Coefficient Reported):      Logistic (Odds Ratio)    Hazard Model (Hazard Ratio)

Demography
  Gender Dummy (Male = 1)                  1.551 (.219)***          1.236 (.167)
  Age Difference                           .986 (.004)***           .996 (.004)
  Gender Difference                        .869 (.014)***           1.009 (.010)
  Education Difference                     .906 (.023)***           .971 (.020)
Geographic Distance
  Geographic Proximity (Same Office = 1)   .857 (.088)              .865 (.078)
Task Characteristics
  Prior Project Co-Work                    1.042 (.016)***          1.031 (.012)**
  Industry Tenure Difference               .996 (.006)              1.002 (.006)
  Same Area Specialty                      .883 (.080)              .983 (.067)
Organizational Hierarchy
  Managerial Level Difference              .951 (.038)              .997 (.033)
  Partner Dummy                            .933 (.188)              1.062 (.168)
  Consultant Dummy                         .870 (.184)              1.118 (.207)
Descriptive Network Characteristics
  Communication Volume (Total Email)       1.0002 (.0002)**         1.000 (.000)
  Strength of Tie                          1.002 (.001)***          1.000 (.000)
  Path Length                              .711 (.047)***           .828 (.033)***
  Friends in Common                        .954 (.007)***           .992 (.005)
  Betweenness Centrality                   1.005 (.002)**           1.004 (.002)**
  Constraint                               .212 (.225)              .326 (.389)
Word Type
  Common Information                       3.209 (.056)***          2.292 (.065)***
  Discussion Topics                        .081 (.008)***           .025 (.002)***

Log Pseudolikelihood                       -234204.48               -1694852.4
Wald χ² (d.f.)                             6264.80 (19)***          8878.76 (19)***
Pseudo R²                                  .28                      -
Observations                               543308                   462422
Notes: Age, Edu, Industry Tenure not significant. * p <.05; ** p <.01; *** p <.001.

Hazard rate model estimates of the drivers of the rate of information receipt reveal positive effects for project co-work and betweenness centrality, and a negative relationship between path length and the rate at which information is received.
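As a worked check on how the reported ratios translate into the percentages quoted in the text, the short Python sketch below converts an odds ratio or hazard ratio into a percent change. The helper name is hypothetical, and the conversion assumes the usual multiplicative reading of the ratio, so a change of several units compounds rather than adds. Strictly, for the logistic model the change applies to the odds of receipt, which the text describes loosely as likelihood.

```python
def percent_change(ratio, units=1):
    """Percent change in the odds (or hazard) implied by a ratio
    for an increase of `units` in the independent variable."""
    return (ratio ** units - 1.0) * 100.0

# Ratios taken from Table 3 (Drivers of Access to Information), Model 1
gender_effect = percent_change(1.551)        # men vs. women
path_effect = percent_change(0.711)          # one additional hop
tie_effect = percent_change(1.002, units=10) # ten additional emails

print(f"Male: {gender_effect:+.1f}%")
print(f"Path length, one hop: {path_effect:+.1f}%")
print(f"Strength of tie, ten emails: {tie_effect:+.2f}%")
```

These reproduce the figures in the discussion: roughly 55% for gender, a 29% reduction per additional hop, and about 2% for ten additional emails.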
These results demonstrate the importance of demographic distance, network structure and project based working relationships on the likelihood of receiving information and the rate at which it is received. We note that network characteristics may be endogenous. Network characteristics may be correlated with information diffusion due to the nature of the selection criteria used to choose words. We therefore interpret parameter estimates of network characteristics as descriptive control variables.

Estimation of the Diffusion of Discussion Topics & Event News

Table 4 presents estimates of the drivers of event news and discussion topic diffusion. Demographic distance reduces the likelihood of receiving both news and discussion topics, with a larger impact for discussion topics. One additional year of education difference between two individuals reduces the likelihood that news will diffuse between them by 7.5%, while reducing the likelihood of discussion topics diffusing by nearly 17%. Interestingly, men are over 50% more likely to see news than women although gender has no effect on the likelihood of the diffusion of discussion topics.

Strong ties are important predictors of the diffusion of discussion topics but not of news. News seems to diffuse pervasively throughout the organization without regard to the strength of ties: information of general interest is passed through relatively weak ties as well. Ten additional emails exchanged increases the likelihood that discussion topics will diffuse by 7% on average. Path length reduces the likelihood of information diffusion, although the impact is much larger for discussion topics than for news. An additional hop between individuals reduces the likelihood of discussion diffusion by 97%, indicating discussion topics diffuse locally, while news travels across multiple hops. Betweenness centrality increases the likelihood of seeing both news and discussion topics.

Table 4. Drivers of Access to Discussion Topics & Event News

                                         NEWS                   DISCUSSION             NEWS                   DISCUSSION
                                         Model 1                Model 2                Model 3                Model 4
Dependent Variable:                      Word Received          Word Received          Rate of Receipt        Rate of Receipt
Specification (Coefficient):             Logistic               Logistic               Hazard Model           Hazard Model
                                         (Odds Ratio)           (Odds Ratio)           (Hazard Ratio)         (Hazard Ratio)

Demography
  Gender (Male = 1)                      1.544 (.227)***        1.073 (.137)           1.332 (.228)*          1.075 (.162)
  Age Difference                         .992 (.004)**          .981 (.007)***         .998 (.004)            .994 (.007)
  Gender Difference                      .902 (.017)***         .814 (.069)**          1.007 (.012)           1.092 (.110)
  Education Difference                   .925 (.022)***         .832 (.034)***         .966 (.024)            1.013 (.037)
Geographic Distance
  Geographic Proximity (Collocate = 1)   .883 (.090)            .929 (.106)            .879 (.097)            .993 (.115)
Task Characteristics
  Prior Project Co-Work                  1.010 (.014)           1.080 (.0185)***       1.018 (.016)           1.066 (.018)***
  Industry Tenure Difference             .996 (.006)            .978 (.008)**          .999 (.008)            .999 (.008)
  Same Area Specialty                    .933 (.073)            1.038 (.139)           .981 (.078)            1.795 (.252)***
Organizational Hierarchy
  Managerial Level Difference            .963 (.035)            1.138 (.079)*          .992 (.037)            1.097 (.089)
  Partner Dummy                          .856 (.186)            1.515 (.271)**         1.084 (.216)           1.411 (.232)**
  Consultant Dummy                       .798 (.177)            1.659 (.262)***        1.221 (.289)           1.749 (.288)***
Descriptive Network Characteristics
  Email Volume                           1.0001 (.00007)*       1.0001 (.0001)*        1.0001 (.000)          1.0001 (.000)**
  Strength of Tie                        1.000 (.000)           1.007 (.001)***        .999 (.000)            1.006 (.001)***
  Path Length                            .732 (.041)***         .029 (.005)***         .814 (.044)***         .310 (.045)***
  Friends in Common                      .972 (.005)***         .877 (.012)***         .992 (.007)            .969 (.012)**
  Betweenness Centrality                 1.004 (.002)*          1.007 (.002)**         1.006 (.002)**         1.002 (.002)
  Constraint                             .186 (.213)            2.243 (2.651)          .282 (.410)            1.664 (1.698)

Log Pseudolikelihood                     -93273.148             -15167.79              -508288.77             -28166.432
Wald χ² (d.f.)                           204.39 (17)***         2816.61 (17)***        92.80 (17)***          762.33 (17)***
Pseudo R²                                .06                    .54                    -                      -
Observations                             163135                 202500                 120197                 196541
Notes: Age, Edu, Industry Tenure not significant. * p <.05; ** p <.01; *** p <.001.

Perhaps most interestingly, strong working relationships and similarity in industry tenure both have strong positive impacts on the likelihood of receiving discussion topics, but not on the diffusion of news. Each additional project that two people work on together increases the likelihood that discussion diffuses between them by 8%. Discussion topics are more likely to diffuse up and down the organizational hierarchy rather than laterally. As researcher is the omitted position category, strong positive estimates on partner and consultant variables demonstrate that discussion is more likely to diffuse upward rather than down the hierarchical structure of the firm.

Hazard rate analyses mirror the logistic regression results to a large extent. Men see news at a higher rate than women, although demographic differences do not seem to predict the rate at which individuals see either news or discussion topics. The strength of ties has a strong positive impact on the hazard rate for discussion topics but not for news, while greater path lengths consistently reduce the hazard rate across both types of information. We see increases in the rate at which employees see discussion topics with greater project co-work (6.6% increase per additional project). Having the same area of expertise increases the rate while industry tenure differences have no effect. The partner and consultant dummies show that employees in the top two levels of the organization see information at a higher rate.

Access to Information & Productivity

Table 5 presents estimates of the impact of access to information on the productivity of individual recruiters as measured by the number of projects completed.
Each measure of access to information captures a particular dimension of the degree to which recruiters are privy to information diffusing through the email network. Words seen is a count of the number of words each recruiter received in email. Mean rank measures the rank order of receipt for each word relative to other recruiters. Mean time measures the average time in days it takes recruiters to see words. Rank 10% (50%) measures the number of words for which recruiters were in the first 10% (50%) of employees to see the word. Words seen in one week (month) measures how many words the recruiter sees within one week (month).

Access to information predicts project output. Each additional ten words seen are associated with an additional 1% of one project completed. Greater mean rank and longer average receipt times are associated with fewer projects completed, holding constant traditional demographic and human capital variables.

Table 5. Information Diffusion & Project Completions (Dependent Variable: Completed Projects; standard errors in parentheses)

                           Model 1     Model 2     Model 3     Model 4     Model 5     Model 6     Model 7
Age                           .015        .010        .006        .027        .021        .054        .201
                            (.066)      (.063)      (.063)      (.068)      (.067)      (.060)      (.065)
Gender                      -1.115     -1.119*     -1.176*     -1.367*      -1.133      -1.141     -1.349*
                            (.699)      (.632)      (.634)      (.789)      (.712)      (.770)      (.782)
Education                     .066        .162        .153       -.011        .068        .039       -.002
                            (.320)      (.289)      (.296)      (.319)      (.321)      (.303)      (.318)
Industry Experience          -.029       -.012       -.009       -.016       -.026       -.032       -.021
                            (.061)      (.059)      (.060)      (.057)      (.061)      (.053)      (.059)
Partner                      1.335       1.508       1.596       2.491       1.397       2.456       2.361
                           (1.627)     (1.530)     (1.536)     (1.912)     (1.680)     (1.839)     (1.816)
Consultant                   1.592      1.832*      1.857*       2.479       1.660       2.417       2.198
                           (1.079)      (.952)      (.962)     (1.583)     (1.151)     (1.545)     (1.473)
Words Seen                 .001***
                           (.0003)
Mean Rank                             -.225***
                                        (.041)
Mean Time                                         -.132***
                                                    (.023)
Rank 10%                                                       .004***
                                                                (.001)
Rank 50%                                                                   .002***
                                                                           (.0003)
Words Seen In 1 Week                                                                   .008***
                                                                                        (.002)
Words Seen In 1 Month                                                                              .003***
                                                                                                    (.001)
Constant                     -.446      -1.597    13.858**   17.268***       -.069      -1.349      -2.464
                           (5.998)     (5.674)     (5.179)     (5.369)     (6.109)     (5.768)     (6.171)

F-Value (d.f.)         5.13*** (7) 6.73*** (7) 7.07*** (7)  2.94** (7) 4.28*** (7)  3.16** (7) 3.37*** (7)
R²                             .39         .43         .44         .25         .36         .27         .29
Obs.                            41          41          41          41          41          41          41
Note: * p <.10; ** p <.05; *** p <.01

Table 6 presents relationships between access to information diffusion and revenues generated, a quality adjusted measure of output. These results show economically significant relationships, with each additional word seen associated with about $70 of additional revenue generated. Strikingly, access to information diffusion is a much stronger predictor of productivity than traditional human capital variables such as education or industry experience.

Conclusion

We demonstrate that demography, organizational structure, and task characteristics all influence the diffusion of information and the likelihood of involvement in information cascades. We also find that different types of information diffuse differently. While demographic distance reduces the likelihood of seeing both types of information, task characteristics such as project co-work and industry tenure differences affect the likelihood of receiving discussion information more than event news. Discussion topics are more likely to diffuse vertically up and down the organizational hierarchy, across relationships with a prior working history, and across stronger ties, while news is more likely to diffuse laterally as well as vertically, and without regard to the strength or function of relationships. These differences also strongly correlate with productivity. Information workers who receive more novel information sooner complete projects faster and generate significantly more revenue for the firm. Our findings provide some of the first evidence of the economic significance of information diffusion in networks.

Table 6. Information Diffusion & Revenues (Dependent Variable: Total Revenues; standard errors in parentheses)

                              Model 1        Model 2        Model 3        Model 4        Model 5        Model 6        Model 7
Age                           1127.36         888.59         720.11        1812.38        1414.45        2846.80        1525.45
                            (2821.64)      (2684.60)      (2676.23)      (3079.27)      (2884.71)      (2820.33)      (2935.73)
Gender                     -65152.48*     -65387.82*      -67740.8*     -70968.47*      -65451.9*      -64268.99     -71504.52*
                           (36796.11)     (34113.54)     (34320.92)     (41507.65)     (37780.24)     (41860.22)     (41663.96)
Education                    -3340.51        1052.76         453.83       -9337.38       -3653.47       -6093.10       -8428.63
                           (13410.84)     (12231.44)     (12489.27)     (13878.34)     (13658.48)     (14103.16)     (13741.45)
Industry Experience          -2517.68       -1744.85       -1599.66       -2061.68       -2365.72       -2648.24       -2222.38
                            (2771.90)      (2755.45)      (2766.55)      (2749.58)      (2789.73)      (2630.66)      (2771.76)
Partner                      121600.4      129394.5*      133607.9*      171580.1*       125243.7      171220.7*      167003.2*
                           (77138.46)     (70803.72)     (71045.74)     (96160.11)     (81137.58)      (91832.3)     (90702.31)
Consultant                   61777.68       72674.13       73515.88       91727.37       64306.19       93837.54       82969.67
                           (61463.41)     (55064.94)     (55743.18)      (87988.3)     (66203.77)     (84155.48)     (82146.35)
Words Seen                   70.52***
                              (15.61)
Mean Rank                                -10202.88***
                                            (1992.77)
Mean Time                                                -5931.05***
                                                           (1130.32)
Rank < 10%                                                                152.07**
                                                                           (58.76)
Rank < 50%                                                                                64.93***
                                                                                           (16.16)
Words Seen In 1 Week                                                                                    321.50***
                                                                                                         (114.98)
Words Seen In 1 Month                                                                                                  114.96***
                                                                                                                         (38.76)
Constant                     166804.6       64973.45    765031.8***    915736.5***       195308.1       85886.32       68776.88
                           (272595.9)    (247744.40)      (22344.2)     (231192.6)     (276691.3)     (255321.6)     (290924.2)

F-Value (d.f.)            4.46*** (7)    5.39*** (7)    5.54*** (7)     2.64** (7)    3.77*** (7)    3.56*** (7)     2.83** (7)
R²                                .39            .42            .42            .24            .36            .27            .27
Obs.                               41             41             41             41             41             41             41

References

Allen, T.J. 1977. Managing the flow of technology. Cambridge, MA, MIT Press.
Ancona, D.G. & Caldwell, D.F. 1992. Demography & Design: Predictors of new Product Team Performance. Organization Science, 3(3): 321-341.
Aral, S., Brynjolfsson, E., & Van Alstyne, M. 2006. Information, Technology and Information Worker Productivity: Task Level Evidence. Intl. Conference on Information Systems, Milwaukee, Wisconsin.
Arrow, K.J. 1962. The Economic Implications of Learning by Doing. Rev. Econ. Stud. (29:3): 155-173.
Blau, P. 1977. Inequality and Heterogeneity. New York: Free Press.
Bulkley, N., Van Alstyne, M. 2005. Why Information Should Influence Productivity, in Network Society: A Cross Cultural Perspective, Manuel Castells (Ed.), Edward Elgar, Northampton, MA, 145-173.
Burt, R. 1992. Structural Holes: The Social Structure of Competition. Harvard University Press, Cambridge, MA.
Burt, R. 2002. Bridge decay. Social Networks, 24(4): 333-363.
Burt, R. 2004. Structural holes & good ideas. American Journal of Sociology, (110): 349-99.
Buskens, V. & K. Yamaguchi. 1999. A new model for information diffusion in heterogeneous social networks. Sociological Methodology, 29: 281-325.

Cohen, W. & D. Levinthal. 1990. Absorptive Capacity: A new perspective on learning and innovation. Administrative Science Quarterly, 35: 128-152.
Cohen, W. & P. Bacdayan. 1994. Organizational routines are stored as procedural memory: Evidence from a laboratory study. Organization Science, 5(4): 554-568.
Cramton, C.D. 2001. The mutual knowledge problem and its consequences for dispersed collaboration. Organization Science, 12(3): 346-371.
Cummings, J., & Cross, R. 2003. Structural properties of work groups and their consequences for performance. Social Networks, 25(3): 197-210.
Cyert, R.M., March, J.G. 1963. A Behavioral Theory of the Firm, Malden, MA, Blackwell Publishers.
Dellarocas, C. 2003. The digitization of word of mouth: Promise and challenges of online feedback mechanisms. Management Science 49(10): 1407-1424.
Domingos, P., & M. Richardson. 2001. Mining the network value of customers. Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA: 57-66.
Freeman, L. 1979. Centrality in social networks: Conceptual clarification. Social Networks, 1(3): 215-234.
Galbraith, J.R. 1973. Designing Complex Organizations. Reading, MA, Addison-Wesley.
Greve, H. et al. 2001. Estimation of diffusion models from incomplete data. Soc. Methods & Research, 29: 435.
Granovetter, M. 1978. Threshold models of collective behavior. American Journal of Sociology 83(6): 1420-1443.
Gruhl, et al. 2004. Information diffusion through blogspace. Proceedings of 13th Int'l Conf. on WWW. NY.
Hansen, M. 1999. "The search-transfer problem: The role of weak ties in sharing knowledge across organization subunits." Administrative Science Quarterly (44:1): 82-111.
Hansen, M. 2002. "Knowledge networks: Explaining effective knowledge sharing in multiunit companies." Organization Science (13:3): 232-248.
Hirshleifer, D., Subrahmanyam, A., & S. Titman. 1994. Security Analysis and Trading Patterns when Some Investors Receive Information Before Others. Journal of Finance, 49(5): 1665-1698.
Jackson, M. & L. Yariv, Economie Publique, Numero 16, pp 3-16, 2005/1.
Kempe, D., Kleinberg, J., & E. Tardos. 2003. Maximizing the spread of influence through a social network. Proceedings of the 9th ACM SIGKDD, Washington, D.C.: 137-146.
Leskovec, J., Singh, A., & J. Kleinberg. 2006. Patterns of influence in a recommendation network. Pacific-Asia Conference on Knowledge Discovery & Data Mining (PAKDD).
McPherson, M., L. Smith-Lovin & J. Cook. 2001. Birds of a Feather: Homophily in Social Networks. Annual Review of Sociology 27: 415-444.
Mintzberg, H. 1979. The Structuring of Organizations, Prentice-Hall, Englewood Cliffs, NJ.
Nelson, R. What is public and what is private about technology? Working Paper 9-90, Center for Research in Management, University of California Berkeley.
Newman, M., Forrest, S. & J. Balthrop. 2002. Email networks and the spread of a computer virus. Phys. Rev. E., 66, 035101.
Pfeffer, J. 1983. Organizational Demography, in Larry L. Cummings and Barry M. Staw (eds.), Research in Organizational Behavior, 5: 299-357. JAI Press, Greenwich, CT.
Reagans, R. & McEvily, B. 2003. Network Structure & Knowledge Transfer: The Effects of Cohesion & Range. Administrative Science Quarterly, (48): 240-67.
Reagans, R. & Zuckerman, E. 2001. "Networks, diversity, and productivity: The social capital of corporate R&D teams." Organization Science (12:4): 502-517.
Reagans, R. & Zuckerman, E. 2006. "Why Knowledge Does Not Equal Power: The Network Redundancy Tradeoff." Working Paper, Sloan School of Management 2006, pp. 1-67.
Rogers, E. 1995. The Diffusion of Innovations. The Free Press, New York.
Schelling, T.C. 1978. Micromotives & Macrobehavior. George J. McLeod Ltd. Toronto.
Strang, D. & N. Tuma. 1993. Spatial and temporal heterogeneity in diffusion. Amer. Jrnl Soc., 99(3): 614.
Tuma, N.B., & Hannan, M.T. 1984. Social Dynamics: Models and Methods. Academic Press, New York.
Uzzi, B. 1997. Social structure and competition in interfirm networks: The paradox of embeddedness. Administrative Science Quarterly, 42: 35-67.
Van Alstyne, M. & Zhang, J. 2003. EmailNet: A System for Automatically Mining Social Networks from Organizational Email Communication, NAACSOS; Pittsburgh: Carnegie Mellon.
Von Hippel, E. 1998. "Economics of Product Development by Users: The Impact of "Sticky" Local Information." Management Science (44:5): 629-644.
Wu, F., Huberman, B., Adamic, L., & J. Tyler. 2004. Information flow in social groups. Physica A: Statistical and Theoretical Physics, 337(1-2): 327-335.