Modeling and Predicting Popularity Dynamics via Reinforced Poisson Processes




Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence

Huawei Shen (1), Dashun Wang (2), Chaoming Song (3), Albert-László Barabási (4)
(1) Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
(2) IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598, USA
(3) Department of Physics, University of Miami, Coral Gables, Florida 33146, USA
(4) Center for Complex Network Research, Northeastern University, Boston, Massachusetts 02115, USA
Email: shenhuawei@ict.ac.cn

Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

An ability to predict the popularity dynamics of individual items within a complex evolving system has important implications in an array of areas. Here we propose a generative probabilistic framework using a reinforced Poisson process to explicitly model the process through which individual items gain their popularity. This model distinguishes itself from existing models via its capability of modeling the arrival process of popularity and its remarkable power at predicting the popularity of individual items. It possesses the flexibility of applying Bayesian treatment to further improve the predictive power using a conjugate prior. Extensive experiments on a longitudinal citation dataset demonstrate that this model consistently outperforms existing popularity prediction methods.

Introduction

The information explosion, from knowledge databases to online media, places the attention economy at the center of this era. At the heart of the attention economy lies a competing process through which a few items become popular while most are forgotten over time (Wu and Huberman 2007). For example, videos on YouTube or stories on Digg gain their popularity by striving for views or votes (Szabo and Huberman 2010); papers increase their visibility by competing for citations from new papers (Ren et al. 2012; Wang, Song, and Barabási 2013); tweets or hashtags on Twitter become more popular as they are retweeted (Hong, Dan, and Davison 2011), and so do webpages as they attract incoming hyperlinks (Ratkiewicz et al. 2010). An ability to predict the popularity of individual items within a dynamically evolving system not only probes our understanding of complex systems, but also has important implications in a wide range of domains, from marketing and traffic control to policy making and risk management.

Despite recent advances of empirical methods, we lack a general modeling framework to predict the popularity of individual items within a complex evolving system. Current models fall into two main paradigms, each with known strengths and limitations. One focuses on reproducing certain statistical quantities over an aggregation of items (Barabási and Albert 1999; Kempe, Kleinberg, and Tardos 2003; Backstrom et al. 2006; Dezso et al. 2006; Crane and Sornette 2008; Ratkiewicz et al. 2010). These models have been successful in understanding the underlying mechanisms of popularity dynamics. Yet, as they do not provide a way to extract item-specific parameters, these models lack predictive power for the popularity dynamics of individual items. The other line of enquiry, in contrast, treats the popularity dynamics as time series, making predictions by either exploiting temporal correlations (Szabo and Huberman 2010; Yang and Leskovec 2010; Lerman and Hogg 2010; Yan et al. 2011; Yu et al. 2012; Bao et al. 2013b) or fitting to these time series certain classes of functions (Bass 1969; Mahajan, Muller, and Bass 1990; Vu et al. 2011; Matsubara et al. 2012; Lerman and Hogg 2012; Gomez-Rodriguez, Leskovec, and Schölkopf 2013; Yang and Zha 2013).
Despite their initial success in certain domains, these models are deterministic, modeling the popularity dynamics in a mean-field, if heuristic, fashion by focusing on the average amount of attention received within a fixed time window, ignoring the underlying arrival process of attentions. Indeed, to the best of our knowledge, we lack a probabilistic framework to model and predict the popularity dynamics of individual items. The reason behind this is partly illustrated in Figure 1, suggesting that the dynamical processes governing individual items appear too noisy to be amenable to quantification.

Figure 1: Stochastic popularity dynamics. (a) Citations versus years after publication for 20 papers randomly selected from Physical Review during the 1960s. (b) Frequency versus hours after being posted for 20 hashtags randomly selected from Twitter in 2012.

In this paper, we model the stochastic popularity dynamics using reinforced Poisson processes, capturing simultaneously three key ingredients: fitness of an item, characterizing its inherent competitiveness against other items; a general temporal relaxation function, corresponding to the aging in the ability to attract new attentions; and a reinforcement mechanism, documenting the well-known rich-get-richer phenomenon. The benefit of the proposed model is threefold: (1) It models the arrival process of individual attentions directly, in contrast to relying on aggregated popularity time series; (2) As a generative probabilistic model, it can be easily incorporated into the Bayesian framework to account for external factors, hence leading to improved predictive power; (3) The flexibility in its choice of specific relaxation functions makes it a general framework that can be adapted to model the popularity dynamics in different domains.

Taking the citation system as an exemplary case, we demonstrate the effectiveness of the proposed framework using a dataset peculiar in its longitudinality, spanning over 100 years and containing all the papers ever published by American Physical Society. We find the proposed model consistently outperforms competing methods. Moreover, the proposed model is general. Hence it is not limited to predicting citations, but with appropriate adjustments will likely apply to other domains driven by competing processes.

Reinforced Poisson Process

The popularity dynamics of an individual item $d$ during time period $[0, T]$ is characterized by a set of time moments $\{t_i^d\}$ $(1 \le i \le n_d)$ when each attention is received, where $n_d$ represents the total number of attentions. Without loss of generality, we have $0 = t_0^d \le t_1^d \le \cdots \le t_i^d \le \cdots \le t_{n_d}^d \le T$.

To model the arrival process of $\{t_i^d\}$, we consider two major phenomena confirmed independently in previous studies of population dynamics: (1) the reinforcement capturing the rich-get-richer mechanism, i.e., previous attention triggers more subsequent attentions (Crane and Sornette 2008); (2) the aging effect characterizing time-dependent attractiveness of individual items (Ulrich and Miller 1993; Wang, Song, and Barabási 2013). Taking these two factors together, for an individual item $d$, we model its popularity dynamics as a reinforced Poisson process (RPP) (Pemantle 2007) characterized by the rate function $x_d(t)$ as

$$x_d(t) = \lambda_d f_d(t; \theta_d)\, i_d(t), \tag{1}$$

where $\lambda_d$ is the intrinsic attractiveness, $f_d(t; \theta_d)$ is the relaxation function that characterizes the temporal inhomogeneity due to the aging effect modulated by parameters $\theta_d$, and $i_d(t)$ is the total number of attentions received up to time $t$.
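As a small illustration of the rate function in Eq. (1), the sketch below evaluates $x_d(t)$ for one item. It is only a sketch under stated assumptions: the relaxation function is taken to be the log-normal density that the paper adopts later in Eq. (16), and the function names (`lognormal_pdf`, `rpp_rate`) and the numeric values are hypothetical.

```python
import math


def lognormal_pdf(t, mu, sigma):
    """Log-normal density, assumed here as the relaxation function f_d(t; theta_d)."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma * t)


def rpp_rate(t, fitness, mu, sigma, attentions_so_far):
    """Rate x_d(t) = lambda_d * f_d(t; theta_d) * i_d(t), following Eq. (1)."""
    return fitness * lognormal_pdf(t, mu, sigma) * attentions_so_far


# Hypothetical values: the rate is proportional to the accumulated attention,
# which is the rich-get-richer (reinforcement) ingredient of the model.
base = rpp_rate(2.0, fitness=1.5, mu=1.0, sigma=1.0, attentions_so_far=30)
doubled = rpp_rate(2.0, fitness=1.5, mu=1.0, sigma=1.0, attentions_so_far=60)
print(base, doubled)
```

Because the three factors enter multiplicatively, fitness scales the whole curve, the relaxation function shapes it over time, and the attention count feeds back into the rate.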
From a Bayesian viewpoint, the total number of attentions $i_d(t)$ is the sum of the number of real attentions and the effective number of attentions, which plays the role of prior belief. Here, we assume that all items are created equal, and hence the effective number of attentions for all items has the same value, denoted by $m$. Therefore, during the time interval between the $(i-1)$th and $i$th attentions, we have

$$i_d(t) = m + i - 1, \tag{2}$$

where $1 \le i \le n_d$. Accordingly, during the time interval between the $n_d$th attention and $T$, the total number of attentions is $m + n_d$.

The length of time interval between two consecutive attentions follows an inhomogeneous Poisson process. Therefore, given that the $(i-1)$th attention arrives at $t_{i-1}^d$, the probability that the $i$th attention arrives at $t_i^d$ follows

$$p_1(t_i^d \mid t_{i-1}^d) = \lambda_d f_d(t_i^d; \theta_d)(m + i - 1)\, e^{-\int_{t_{i-1}^d}^{t_i^d} \lambda_d f_d(t; \theta_d)(m + i - 1)\,dt}, \tag{3}$$

and the probability that no attention arrives between $t_{n_d}^d$ and $T$ is

$$p_0(T \mid t_{n_d}^d) = e^{-\int_{t_{n_d}^d}^{T} \lambda_d f_d(t; \theta_d)(m + n_d)\,dt}. \tag{4}$$

Incorporating Eqs. (3) and (4) with the fact that attentions during different time intervals are statistically independent, the likelihood of observing the popularity dynamics $\{t_i^d\}$ during time interval $[0, T]$ follows

$$\mathcal{L}(\lambda_d, \theta_d) = p_0(T \mid t_{n_d}^d) \prod_{i=1}^{n_d} p_1(t_i^d \mid t_{i-1}^d) = \lambda_d^{n_d} \prod_{i=1}^{n_d} (m + i - 1) f_d(t_i^d; \theta_d)\, e^{-\lambda_d \left( (m + n_d) F_d(T; \theta_d) - \sum_{i=1}^{n_d} F_d(t_i^d; \theta_d) \right)}, \tag{5}$$

where $F_d(t; \theta_d) \equiv \int_0^t f_d(t'; \theta_d)\,dt'$ and we have reorganized the terms in the exponent for simplicity. For clarity, we illustrate the proposed RPP model in the graphical representation (Figure 2).

Figure 2: Graphical representation of the generative model for popularity dynamics via reinforced Poisson process.

By maximizing the likelihood function in Eq. (5), we obtain the most likely fitness parameter $\lambda_d^*$ for item $d$ in closed form:

$$\lambda_d^* = \frac{n_d}{(m + n_d) F_d(T; \theta_d^*) - \sum_{i=1}^{n_d} F_d(t_i^d; \theta_d^*)}. \tag{6}$$

The solution for $\theta_d^*$ depends on the specific form of relaxation function $f_d(t; \theta_d)$. We save the discussion about the estimation of $\theta_d^*$ for later.
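To make Eqs. (2) to (6) concrete, the following sketch draws arrival times from the process and then recovers the fitness with the closed form of Eq. (6). It is a minimal sketch under assumptions that are not part of this section: the relaxation function is taken to be log-normal (the form used later in Eq. (16)), so that $F_d$ is a log-normal cumulative distribution; the relaxation parameters are treated as known; and the sampler inverts the survival probability implied by Eq. (3) one interval at a time. The names `simulate_rpp` and `fitness_mle` and all numeric values are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def relaxation_cdf(t, mu, sigma):
    """F_d(t; theta_d): cumulative log-normal relaxation (assumed form)."""
    if t <= 0:
        return 0.0
    return STD_NORMAL.cdf((math.log(t) - mu) / sigma)


def simulate_rpp(fitness, mu, sigma, m, horizon, rng):
    """Draw arrival times in (0, horizon] by inverting Eq. (3) interval by interval."""
    times = []
    cum = 0.0  # F_d at the most recent arrival; F_d(0) = 0
    while True:
        count = m + len(times)  # i_d(t) = m + i - 1, Eq. (2)
        # Survival is exp(-fitness * count * (F(t) - F(t_prev))), so the
        # increment in F is an Exp(1) draw divided by fitness * count.
        cum += rng.expovariate(1.0) / (fitness * count)
        if cum >= 1.0:  # F_d never exceeds 1: no further arrival
            break
        t = math.exp(mu + sigma * STD_NORMAL.inv_cdf(cum))
        if t > horizon:
            break
        times.append(t)
    return times


def fitness_mle(times, mu, sigma, m, horizon):
    """Closed-form lambda_d* of Eq. (6) for fixed relaxation parameters."""
    n = len(times)
    denom = (m + n) * relaxation_cdf(horizon, mu, sigma) - sum(
        relaxation_cdf(t, mu, sigma) for t in times
    )
    return n / denom


rng = random.Random(0)
times = simulate_rpp(fitness=2.0, mu=1.0, sigma=1.0, m=30, horizon=10.0, rng=rng)
print(len(times), fitness_mle(times, mu=1.0, sigma=1.0, m=30, horizon=10.0))
```

The denominator in `fitness_mle` is at least $m F_d(T)$, so the estimate is well defined whenever $m > 0$; with $n_d = 0$ it returns zero, which is the limitation the Bayesian treatment addresses.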
Next we show that, with the obtained $\lambda_d^*$ and $\theta_d^*$, the model can be used to predict the expected number $c^d(t)$ of attentions gathered by item $d$ up to any given time $t$. Indeed, according to Eq. (1), for $t \ge T$, this prediction task is equivalent to the following differential equation

$$\frac{dc^d(t)}{dt} = \lambda_d^* f_d(t; \theta_d^*)\left(m + c^d(t)\right) \tag{7}$$

with the boundary condition $c^d(T) = n_d$. Solving this differential equation, we obtain the prediction function

$$c^d(t) = (m + n_d)\, e^{\lambda_d^* \left( F_d(t; \theta_d^*) - F_d(T; \theta_d^*) \right)} - m. \tag{8}$$
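The prediction function of Eq. (8) can be sketched in a few lines. As before, this is an illustration under assumptions: the relaxation function is taken to be log-normal (the form adopted later in Eq. (16)), and the function names and parameter values are hypothetical rather than taken from the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def relaxation_cdf(t, mu, sigma):
    """F_d(t; theta_d): cumulative log-normal relaxation (assumed form)."""
    if t <= 0:
        return 0.0
    return STD_NORMAL.cdf((math.log(t) - mu) / sigma)


def predict_attentions(t, n_observed, fitness, mu, sigma, m, horizon):
    """Eq. (8): expected attentions at time t >= horizon (the end T of training)."""
    growth = fitness * (
        relaxation_cdf(t, mu, sigma) - relaxation_cdf(horizon, mu, sigma)
    )
    return (m + n_observed) * math.exp(growth) - m


# Hypothetical item: 40 attentions observed in the first 10 time units.
for t in (10.0, 15.0, 20.0, 30.0):
    print(t, predict_attentions(t, 40, fitness=1.2, mu=1.0, sigma=1.0, m=30, horizon=10.0))
```

At $t = T$ the function returns the observed count, matching the boundary condition of Eq. (7). Because $F_d$ is bounded by 1 under this assumed form, the prediction saturates at $(m + n_d)\,e^{\lambda_d^*(1 - F_d(T))} - m$ instead of growing without bound.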

Figure 3: Probabilistic graphical model for reinforced Poisson process with conjugate prior.

Reinforced Poisson Process with prior

Maximum likelihood parameter estimation suffers from the overfitting problem for small sample size. For example, Eq. (6) gives $\lambda_d^* = 0$ when $n_d = 0$, and results in a null forecast of future popularity, i.e., $c^d(t) = 0$ at any future time $t$. Moreover, the exponential dependency of $c^d(t)$ on $\lambda_d^*$ in Eq. (8) leads to a large uncertainty in the prediction of $c^d(t)$. In this section, to overcome the drawback of the parameter estimation in Eq. (6), we adopt the Bayesian treatment for popularity prediction by introducing a conjugate prior for the fitness parameter $\lambda_d$, leading to a further improvement of the prediction accuracy of the proposed RPP model.

The likelihood function in Eq. (5) is a product of a power function and an exponential function of $\lambda_d$. Therefore, the conjugate prior for $\lambda_d$ follows the gamma distribution

$$p(\lambda_d \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \lambda_d^{\alpha - 1} e^{-\beta \lambda_d}. \tag{9}$$

Note that this conjugate prior is the prior distribution of fitness parameters for all $N$ items rather than for a certain individual item. Hereafter, for convenience, we use $\vec{t}^{\,d} \equiv \{t_i^d\}$ to denote all the arrival times of attentions gathered by item $d$. After introducing the conjugate prior, the graphical representation of the model is depicted in Figure 3.

Using Bayes' theorem and combining Eqs. (5) and (9), we obtain the posterior distribution of $\lambda_d$:

$$p(\lambda_d \mid \vec{t}^{\,d}, \theta_d, \alpha, \beta) = \frac{p(\vec{t}^{\,d} \mid \lambda_d, \theta_d)\, p(\lambda_d \mid \alpha, \beta)}{\int p(\vec{t}^{\,d} \mid \lambda_d, \theta_d)\, p(\lambda_d \mid \alpha, \beta)\, d\lambda_d} = \frac{(\beta + X)^{\alpha + n_d}}{\Gamma(\alpha + n_d)}\, \lambda_d^{\alpha + n_d - 1} e^{-\lambda_d (\beta + X)}, \tag{10}$$

where $X \equiv (m + n_d) F_d(T; \theta_d) - \sum_{i=1}^{n_d} F_d(t_i^d; \theta_d)$. With the obtained posterior distribution of $\lambda_d$, the expected number of attentions $c^d(t)$, as shown in Eq. (8), can be predicted using its mean over the posterior distribution as

$$\langle c^d(t) \rangle = \int c^d(t)\, p(\lambda_d \mid \vec{t}^{\,d}, \theta_d, \alpha, \beta)\, d\lambda_d = (m + n_d) \left( \frac{\beta + X}{\beta + X - Y} \right)^{\alpha + n_d} - m, \tag{11}$$

where $Y \equiv F_d(t; \theta_d) - F_d(T; \theta_d)$. When $\beta \to \infty$, the prediction function reduces to a naive method, i.e., predicting that the popularity stays constant in the future.

Eq. (11) is the Bayesian prediction function, predicting $c^d(t)$ using the posterior distribution of $\lambda_d$ instead of using a single value of $\lambda_d$ obtained by maximum likelihood estimation. Neither $X$, corresponding to empirical observations, nor $Y$, reflecting the rate difference in the reinforced Poisson process, is in the exponent, indicating the robustness of this prediction function.

We now discuss how to determine the parameters $\alpha$ and $\beta$ of the prior distribution. In principle, the values of prior parameters could be tuned by checking the accuracy of the prediction function with respect to prior parameters on a so-called validation set. Yet, this requires us to know the future popularity of some items to determine prior parameters, hence may not be practical in scenarios where such information is not available. One alternative solution is the fully Bayesian approach, which introduces a hyperprior for prior parameters. Although the fully Bayesian approach is theoretically elegant, the inference of prior parameters is intractable in most cases. Approximation methods or Monte Carlo methods have to be adopted. As a result, the benefit of the fully Bayesian approach is discounted by the approximation gap in approximation methods or the high computational cost of Monte Carlo methods.

In this paper, we determine the value of prior parameters by adopting maximum likelihood estimation with latent variable. Specifically, we choose the $\alpha$ and $\beta$ values that maximize the following logarithmic likelihood function

$$\mathcal{L}(\alpha, \beta) = \sum_{d=1}^{N} \ln \int p(\vec{t}^{\,d} \mid \lambda_d)\, p(\lambda_d \mid \alpha, \beta)\, d\lambda_d. \tag{12}$$

Here, $\theta_d$ is not explicitly written to keep the notation uncluttered. In sum, $\alpha$ and $\beta$ are obtained according to

$$\frac{\partial \mathcal{L}(\alpha, \beta)}{\partial \alpha} = N\left(\ln \beta - \psi_0(\alpha)\right) + \sum_{d=1}^{N} \left[ \psi_0(\alpha + n_d) - \ln \frac{\alpha + n_d}{\lambda_d^*} \right], \tag{13}$$

$$\frac{\partial \mathcal{L}(\alpha, \beta)}{\partial \beta} = \frac{N \alpha}{\beta} - \sum_{d=1}^{N} \lambda_d^*, \tag{14}$$

where $\psi_0$ is the digamma function and the latent variable is

$$\lambda_d^* = \frac{\alpha + n_d}{\beta + (m + n_d) F_d(T; \theta_d) - \sum_{i=1}^{n_d} F_d(t_i^d; \theta_d)}. \tag{15}$$

Comparing Eq. (15) and Eq. (6), we can see that the fitness parameter is adjusted by prior parameters $\alpha$ and $\beta$. Note that the parameters $\theta_d$ for all items are also determined by maximizing the likelihood function in Eq. (12).
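The Bayesian quantities above, the term $X$ from Eq. (10), the latent fitness of Eq. (15), and the prediction function of Eq. (11), can be sketched as follows. This is a hedged illustration, not the authors' implementation: the relaxation function is assumed to be log-normal (as adopted later in Eq. (16)), the prior parameters are taken as given rather than learned from Eqs. (13) and (14), and all function names and the relaxation parameters in the usage example are hypothetical. The values $\alpha = 5.322$, $\beta = 6.796$ and $m = 30$ are the ones reported in the paper for collection (a).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def relaxation_cdf(t, mu, sigma):
    """F_d(t; theta_d): cumulative log-normal relaxation (assumed form)."""
    if t <= 0:
        return 0.0
    return STD_NORMAL.cdf((math.log(t) - mu) / sigma)


def exposure(times, mu, sigma, m, horizon):
    """X = (m + n_d) F_d(T) - sum_i F_d(t_i), as defined below Eq. (10)."""
    n = len(times)
    return (m + n) * relaxation_cdf(horizon, mu, sigma) - sum(
        relaxation_cdf(t, mu, sigma) for t in times
    )


def posterior_fitness_mean(times, alpha, beta, mu, sigma, m, horizon):
    """Latent fitness of Eq. (15): (alpha + n_d) / (beta + X)."""
    return (alpha + len(times)) / (beta + exposure(times, mu, sigma, m, horizon))


def bayes_predict(t, times, alpha, beta, mu, sigma, m, horizon):
    """Eq. (11): prediction averaged over the gamma posterior of the fitness."""
    n = len(times)
    x = exposure(times, mu, sigma, m, horizon)
    y = relaxation_cdf(t, mu, sigma) - relaxation_cdf(horizon, mu, sigma)
    if beta + x - y <= 0:
        raise ValueError("posterior mean of exp(lambda * Y) is not finite")
    return (m + n) * ((beta + x) / (beta + x - y)) ** (alpha + n) - m


# An item with no attention yet: Eq. (6) would forecast zero forever,
# whereas the prior still yields a positive forecast.
print(bayes_predict(20.0, [], alpha=5.322, beta=6.796, mu=1.0, sigma=1.0, m=30, horizon=10.0))
```

Making $\beta$ very large in `bayes_predict` pushes the ratio inside the power toward 1, so the forecast stays at the observed count, which is the naive limit noted after Eq. (11).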
The calculation depends on the specific form of relaxation function $f_d(t; \theta_d)$, which is given in the experiments on a real dataset.

Experiments

In this section, we demonstrate the effectiveness of the proposed RPP model, with and without prior.

Experiment setup

Dataset. We conduct experiments on an excellent longitudinal dataset, containing all papers and citations published by American Physical Society between 1893 and 2009. We choose this dataset for two main reasons: (1) It covers an extended period of time, spanning 117 years, ideal for modeling and predicting temporal dynamics; (2) Treating papers as items, their popularity is well-defined, characterized by citations. Statistics about this dataset are shown in Table 1.

Relaxation function. When formalizing the model for popularity dynamics, we introduced a general relaxation function $f_d(t; \theta_d)$ and skipped the discussion of parameter $\theta_d$. Here, when applying this model to a specific case, i.e., to the citation system, we need to determine the specific form of the relaxation function as well as $\theta_d$. Previous studies (Radicchi, Fortunato, and Castellano 2008; Wang, Song, and Barabási 2013) on citation dynamics suggest that the aging of papers is captured by a log-normal relaxation function

$$f_d(t; \mu_d, \sigma_d) = \frac{1}{\sqrt{2\pi}\, \sigma_d t} \exp\left( -\frac{(\ln t - \mu_d)^2}{2 \sigma_d^2} \right), \tag{16}$$

a common relaxation function, which is also observed in other domains such as messages in microblogging networks (Bao et al. 2013a). For item $d$ with log-normal relaxation function, $\theta_d$ is replaced by parameters $\mu_d$ and $\sigma_d$, which can be calculated by maximizing the logarithmic likelihood $\mathcal{L}$ in Eq. (12) and Eq. (5) for the proposed RPP model with and without prior, respectively. In this paper, we maximize the logarithmic likelihood using optimization methods which leverage the gradients

$$\frac{\partial \mathcal{L}}{\partial \mu_d} = \frac{1}{\sigma_d} \left\{ \sum_{i=1}^{n_d} \left[ \chi_i^d - \lambda_d^* \phi(\chi_i^d) \right] + \lambda_d^* (n_d + m)\, \phi(\chi_T^d) \right\}, \tag{17}$$

$$\frac{\partial \mathcal{L}}{\partial \sigma_d} = \frac{1}{\sigma_d} \left\{ \sum_{i=1}^{n_d} \left[ (\chi_i^d)^2 - \lambda_d^* \chi_i^d \phi(\chi_i^d) \right] + \lambda_d^* (n_d + m)\, \chi_T^d \phi(\chi_T^d) - n_d \right\}, \tag{18}$$

where $\phi$ is the probability density function of the standard normal distribution, $\chi_i^d \equiv (\ln t_i^d - \mu_d)/\sigma_d$ and $\chi_T^d \equiv (\ln T - \mu_d)/\sigma_d$. Therefore, we can use Eqs. (17) and (18) together with Eqs. (13) and (14) to maximize the logarithmic likelihood in Eq. (12) for the RPP model with prior, and together with Eq. (6) to maximize the likelihood in Eq. (5) for the RPP model without prior.

Baseline models and evaluation metrics.
We compare the RPP model with three widely-used models for popularity prediction: the classic autoregression method (Box, Jenkins, and Reinsel 2008), the linear regression method of logarithmic popularity (Szabo and Huberman 2010), and the WSB model (Wang, Song, and Barabási 2013), which is equivalent to the proposed RPP model without prior when the log-normal relaxation function is adopted. We adopt two standard measurements as evaluation metrics:

Mean Absolute Percentage Error (MAPE) measures the average deviation between predicted and empirical popularity over an aggregation of items. Denoting with $c^d(t)$ the predicted number of citations for a paper $d$ up to time $t$ and with $r^d(t)$ its real number of citations, we obtain the MAPE over $N$ papers

$$\mathrm{MAPE} = \frac{1}{N} \sum_{d=1}^{N} \left| \frac{c^d(t) - r^d(t)}{r^d(t)} \right|.$$

Accuracy measures the fraction of papers correctly predicted for a given error tolerance $\epsilon$. Hence the accuracy of popularity prediction on $N$ papers is

$$\frac{1}{N} \left| \left\{ d : \left| \frac{c^d(t) - r^d(t)}{r^d(t)} \right| \le \epsilon \right\} \right|.$$

We set the threshold $\epsilon = 0.1$ in this paper.

Table 1: Basic statistics of dataset.

Journal    #Papers    #Citations    Period
PRSI         1,469           668    1893-1912
PR          47,941       590,665    1913-1969
PRA         53,655       418,196    1970-2009
PRB        137,999     1,191,515    1970-2009
PRC         29,935       202,312    1970-2009
PRD         56,616       526,930    1970-2009
PRE         35,944       154,133    1993-2009
PRL         95,516     1,507,974    1958-2009
RMP          2,926       115,697    1929-2009
PRSTAB       1,257         2,457    1998-2009
PRSTPER         90             0    2005-2009
Total      463,348     4,710,547    1893-2009

Experiment Results

In this section, we report two sets of experiments: (1) We compare the predictive power of the RPP model with other competing methods, finding that RPP consistently outperforms other models; (2) We perform detailed analysis to understand the factors that could affect the performance of the RPP model, including the length of training period, the effective number of attentions, and the prior parameters.

Popularity prediction.
We evaluate the prediction results on three collections of papers: (a) papers published in Physical Review (PR) from 1960 to 1969; (b) papers published in Physical Review Letters (PRL) from 1970 to 1979; (c) papers published in Physical Review B (PRB) from 1980 to 1989. These samples vary in timeframes and scopes, spanning three decades and covering three types of journals. Using papers with more than 10 citations during the first five years after publication, we compare the RPP model with and without prior against the autoregression and logarithmic linear regression models. The number of papers in the three collections is 3242, 2017 and 3732, respectively. The training period is 10 years and we predict the citation counts for each paper from the 1st to 20th year after the training period. For collection (c), we predict the citation counts up to the 10th year after the training period due to the cutoff year of the data (2009). We set the parameter $m = 30$ for now, corresponding to the typical number of references for a paper, leaving the effect of varying $m$ on the performance of the RPP model for later discussions.

Figure 4: The performance comparison in popularity prediction. Panels (a)-(c): MAPE versus years after training period. Panels (d)-(f): accuracy versus years after training period. Panels (g)-(i): number of papers versus number of citations, real and predicted. In each row the three panels cover Physical Review (1960s), Physical Review Letters (1970s), and Physical Review B (1980s).

We find the RPP model, proposed in this paper, achieves higher accuracy than the autoregression and logarithmic linear regression methods (Figure 4). Yet in the absence of prior it only exhibits modest performance in terms of MAPE, indicating that the RPP model without prior performs well on most papers but can be skewed by large errors on a handful of papers. This is mainly caused by its exponential dependence on the fitness parameter, which sometimes yields an overfitting problem when maximum likelihood parameter estimation is adopted. This problem is nicely avoided by incorporating a conjugate prior for the fitness parameter, documented by the fact that the RPP model with prior consistently outperforms the other three methods on all collections. The superiority of the RPP model with prior, compared to the autoregression and logarithmic linear regression methods, increases with the number of years after the training period.
This improvement is rooted in the methodological advantage: the RPP model is a generative probabilistic model that explicitly models the arrival process of attentions, while the two baseline methods only capture the correlation between early popularity and future popularity, linearly or logarithmically. In addition, the reinforced Poisson process could model the rich-get-richer phenomenon in popularity dynamics and thus could characterize the logarithmic correlation between early popularity and future popularity. Therefore, when compared with the autoregression method, the superiority is more obvious than when compared with the logarithmic linear regression method. This is because of the linear nature of the autoregression method, while the logarithmic linear regression method works in a logarithmic manner. Furthermore, the RPP models with and without prior are trained only on the popularity dynamics during the training period, while the training of the autoregression and logarithmic linear regression models depends on the knowledge of future popularity dynamics. When training these two models, we employ the leave-one-out technique, which uses all papers except the target paper for prediction. Yet, in most cases, it is unrealistic to know future popularity dynamics when training the model, limiting their applications in real scenarios.

Finally, being a generative model, the RPP model is able to reproduce the citation distribution. Indeed, as shown in Figure 4 (g-i), the distribution of citations predicted by the RPP model with prior matches very well with that of real citations on all studied collections, indicating that the RPP model can also be used to model aggregated properties of a system. Moreover, for completeness, we offer the values of prior parameters $\alpha$ and $\beta$ for the three collections of papers: $\alpha = 5.322$ and $\beta = 6.796$ for collection (a); $\alpha = 5.703$ and $\beta = 7.901$ for collection (b); $\alpha = 5.000$ and $\beta = 5.827$ for collection (c).

Analysis of relevant factors. The superior predictive power of the RPP model with prior raises an interesting question: what are the possible factors that affect its predictive power? In this section, we study a number of factors which could affect the performance of the RPP model with prior. Hereafter, we use $\langle \mathrm{MAPE} \rangle$ to denote the average MAPE for predictions from the 1st to 10th year after the training period. The training period is 10 years except when we discuss the effect of varying training period length. The parameter $m$ is set to be 30 except when we discuss the effect of changing $m$.

Figure 5: Effect of training period length. Curves: mean of prior distribution, variance of prior distribution, and $\langle \mathrm{MAPE} \rangle$, plotted against years for training.

First, we study the prediction accuracy of the RPP model with prior by varying the length of training period. Experiments are conducted on the paper collection (a). As shown in Figure 5, $\langle \mathrm{MAPE} \rangle$ decreases as the training period increases. Hence increasing the training period improves the prediction accuracy. However, the rate at which $\langle \mathrm{MAPE} \rangle$ diminishes slows down quickly, indicating the diminishing marginal gain of increasing the training period.
We also find that the mean of prior distribution stays almost constant as the length of training period increases from 5 years to 15 years, indicating the expected fitness parameter learned by the RPP model is robust against varying training period. At the same time, increasing the training period reduces the role of prior in prediction, partly explaining the role of prior in overcoming the overfitting problem, as the variance in the prior distributions increases with the length of training period.

Second, we investigate the effect of parameter $m$, i.e., the effective number of attentions, by conducting experiments on the paper collection (a). Intuitively, $m$ balances the strength in the reinforcement mechanism. Indeed, as shown in Table 2, the mean and variance of the prior distribution decay with $m$, demonstrating these parameters are mainly determined by papers with fewer citations. We also find that decreasing $m$ reduces $\langle \mathrm{MAPE} \rangle$, indicating that the disparity in citations is captured appropriately by the reinforcement mechanism in our model, as a larger $m$ implies a weaker role of the reinforcement mechanism. Taken together, Table 2 confirms that the reinforcement mechanism is crucial to modeling popularity dynamics in the citation system.

Table 2: Effect of the number of conceived attentions ($m$).

m     Mean (α/β)    Variance (α/β²)    ⟨MAPE⟩
10    1.467         0.193              …762
20    …05           …0                 …776
30    0.783         0.115              …781
40    …47           0.091              …784
50    0.554         0.074              …785

Table 3: Performance on RMP papers over four decades.

Period    α         β        α/β      ⟨MAPE⟩
1950s     4.237     4.061    1.043    0.075
1960s     4.759     4.440    1.072    0.084
1970s     6.130     4.924    1.245    0.111
1980s     10.706    5.379    1.990    0.120

Finally, we use papers published in Reviews of Modern Physics (RMP) to illustrate the change of prior parameters $\alpha$ and $\beta$ over four decades and their influence on the prediction accuracy of the RPP model with prior. As shown in Table 3, the mean of prior distribution (i.e., $\alpha/\beta$) increases with the increasing magnitude of both $\alpha$ and $\beta$ over the four decades.
This indicates that the expected citations for papers in this prestigious journal steadily increased during the second half of the 20th century. Meanwhile, the $\langle \mathrm{MAPE} \rangle$ of the RPP model also increases. Hence it becomes more difficult to predict the citations of these papers, as a result of the increasing disparity in citation distribution (Barabási et al. 2012).

Conclusions

Taken together, we presented a general framework to model and predict popularity dynamics based on a reinforced Poisson process. This model incorporates three key ingredients of popularity dynamics: the fitness parameter characterizing intrinsic attractiveness, the temporal relaxation function explaining the aging effect in attracting new attentions, and the reinforcement mechanism corresponding to the rich-get-richer effect in popularity dynamics. Being a generative probabilistic framework, it explicitly models the stochastic process of gaining popularity for each item, in contrast to existing deterministic approaches. We developed optimization methods to train the proposed RPP model with and without priors. The RPP model with prior allows us to apply the Bayesian treatment, resulting in more robust and accurate predictions for popularity dynamics. We empirically validated our model on an excellent longitudinal dataset on citations, spanning more than one hundred years, demonstrating its clear advantages over competing methods.

Acknowledgments

This work was funded by the National Basic Research Program of China (973 Program) under grant number 2014CB340401, the National High-tech R&D Program of China (863 Program) under grant number 2014AA015103, and the National Natural Science Foundation of China under grant numbers 61202215 and 61232010. This work was also partly funded by the Beijing Natural Science Foundation under grant number 4122077. DW, CS, and ALB are supported by Lockheed Martin Corporation (SRA 11.18.11), the Network Science Collaborative Technology Alliance sponsored by the U.S. Army Research Laboratory under agreement W911NF-09-2-0053, the Defense Advanced Research Projects Agency under agreement 11645021, and the Future and Emerging Technologies Project 317 532 "Multiplex" financed by the European Commission.

References

Backstrom, L.; Huttenlocher, D.; Kleinberg, J.; and Lan, X. 2006. Group formation in large social networks: Membership, growth, and evolution. In KDD '06, 44–54.
Bao, P.; Shen, H. W.; Chen, W.; and Cheng, X. Q. 2013a. Cumulative effect in information diffusion: empirical study on a microblogging network. PLoS ONE 8(10):e76027.
Bao, P.; Shen, H. W.; Huang, J.; and Cheng, X. Q. 2013b. Popularity prediction in microblogging network: A case study on sina weibo. In WWW '13, 177–178.
Barabási, A. L., and Albert, R. 1999. Emergence of scaling in random networks. Science 286(5439):509–512.
Barabási, A. L.; Song, C.; and Wang, D. 2012. Publishing: Handful of papers dominates citation. Nature 491(7422):40–40.
Bass, F. M. 1969. A new product growth for model consumer durables. Management Science 15(5):215–227.
Box, G. E. P.; Jenkins, G. M.; and Reinsel, G. C. 2008. Time Series Analysis: Forecasting and Control. Wiley, 4th edition.
Crane, R., and Sornette, D. 2008. Robust dynamic classes revealed by measuring the response function of a social system. PNAS 105(41):15649–15653.
Dezso, Z.; Almaas, E.; Lukács, A.; Rácz, B.; Szakadát, I.; and Barabási, A. L. 2006. Dynamics of information access on the web. Physical Review E 73:066132.
Gomez-Rodriguez, M.; Leskovec, J.; and Schölkopf, B. 2013. Modeling information propagation with survival theory. In ICML '13.
Hong, L.; Dan, O.; and Davison, B. D. 2011. Predicting popular messages in twitter. In WWW '11, 57–58.
Kempe, D.; Kleinberg, J.; and Tardos, E. 2003. Maximizing the spread of influence through a social network. In KDD '03, 137–146.
Lerman, K., and Hogg, T. 2010. Using a model of social dynamics to predict popularity of news. In WWW '10, 621–630.
Lerman, K., and Hogg, T. 2012. Using stochastic models to describe and predict social dynamics of web users. ACM Transactions on Intelligent Systems and Technology 3(4):62.
Mahajan, V.; Muller, E.; and Bass, F. M. 1990. New product diffusion models in marketing: A review and directions for research. The Journal of Marketing 54:1–26.
Matsubara, Y.; Sakurai, Y.; Prakash, B. A.; Li, L.; and Faloutsos, C. 2012. Rise and fall patterns of information diffusion: Model and implications. In KDD '12, 6–14.
Pemantle, R. 2007. A survey of random processes with reinforcement. Probability Surveys 4:1–79.
Radicchi, F.; Fortunato, S.; and Castellano, C. 2008. Universality of citation distribution: toward an objective measure of scientific impact. PNAS 105(45):17268–17272.
Ratkiewicz, J.; Fortunato, S.; Flammini, A.; Menczer, F.; and Vespignani, A. 2010. Characterizing and modeling the dynamics of online popularity. Physical Review Letters 105(15):158701.
Ren, F. X.; Shen, H. W.; and Cheng, X. Q. 2012. Modeling the clustering in citation networks. Physica A 391(12):3533–3539.
Szabo, G., and Huberman, B. A. 2010. Predicting the popularity of online content. Communications of the ACM 53(8):80–88.
Ulrich, R., and Miller, J. 1993. Information processing models generating lognormally distributed reaction times. J. Math. Psychol. 37(4):513–525.
Vu, D. Q.; Asuncion, A. U.; Hunter, D. R.; and Smyth, P. 2011. Dynamic egocentric models for citation networks. In ICML '11, 857–864.
Wang, D.; Song, C.; and Barabási, A. L. 2013. Quantifying long-term scientific impact. Science 342:127–132.
Wu, F., and Huberman, B. 2007. Novelty and collective attention. PNAS 104(45):17599–17601.
Yan, R.; Tang, J.; Liu, X.; Shan, D.; and Li, X. 2011. Citation count prediction: learning to estimate future citations for literature. In CIKM '11, 1247–1252.
Yang, J., and Leskovec, J. 2010. Modeling information diffusion in implicit networks. In ICDM '10, 599–608.
Yang, S. H., and Zha, H. 2013. Mixture of mutually exciting processes for viral diffusion. In ICML '13, 1–9.
Yu, X.; Gu, Q.; Zhou, M.; and Han, J. 2012. Citation prediction in heterogeneous bibliographic networks. In SDM '12, 1119–1130.