Exploring Image Virality in Google Plus

Marco Guerini, Trento RISE, Trento - Italy. Email: marco.guerini@trentorise.eu
Jacopo Staiano, University of Trento, Trento - Italy. Email: staiano@disi.unitn.it
Davide Albanese, Fondazione Edmund Mach, CRI-CBC, San Michele all'Adige (TN) - Italy. Email: davide.albanese@fmach.it

Abstract
Reactions to posts in an online social network show different dynamics depending on several textual features of the corresponding content. Do similar dynamics exist when images are posted? Exploiting a novel dataset of posts, gathered from the most popular Google+ users, we try to give an answer to such a question. We describe several virality phenomena that emerge when taking into account visual characteristics of images (such as orientation, mean saturation, etc.). We also provide hypotheses and potential explanations for the dynamics behind them, and include cases for which common-sense expectations do not hold true in our experiments.

I. INTRODUCTION
How do things become viral on the Internet? And what exactly do we mean by influence? Since marketing and industry people want their messages to spread in the most effective and efficient way possible, these questions have received a great deal of attention, particularly in recent years, as we have seen a dramatic growth of social networking on the Web. Generally speaking, virality refers to the tendency of a content either to spread quickly within a community or to receive a great deal of attention by it. In studying the spreading process we will focus on the content and its characteristics, rather than on the structure of the network through which the information is moving. In particular, we will investigate the relationships between visual characteristics of images enclosed in Google+ posts and virality phenomena. We will use three virality metrics: plusoners, replies and resharers.

This exploratory work stems from the use people make of social networking websites such as Google+, Facebook and similar: we hypothesized that perceptual characteristics of an image could indeed affect the virality of the post embedding it, and that, for example, cartoons, panoramas or self-portraits affect users' reactions in different ways.
The aim of this paper is to investigate whether signs of such commonsense intuition emerge from large-scale data made available on popular social networking websites like Google+ and, in such case, to open a discussion on the associated phenomena. The paper is structured as follows: first, we review previous works addressing the topic of virality in social networks, and particularly some focusing on content impact. Then, after describing the dataset collected and used for this work, we proceed with the study of virality of Google+ posts and the characteristics of their content. We also discuss the behavior of virality indexes in terms of their alternative use, arguing that plusones and comments fulfill a similar purpose of followers' appreciation while reshares have a different role of self-representation. Finally, we investigate possible interactions between image characteristics and users' typology, in order to understand to what extent results are generalizable or typical of a community gathered around a common interest.

II. RELATED WORKS
Several researchers studied information flow, community building and similar processes using Social Networking sites as a reference [1], [2], [3], [4]. However, the great majority concentrates on network-related features without taking into account the actual content spreading within the network [5]. A hybrid approach focusing on both product characteristics and network-related features is presented in [6]: the authors study the effect of passive-broadcast and active-personalized notifications embedded in an application to foster word of mouth. Recently, the correlation between content characteristics and virality has begun to be investigated, especially with regard to textual content; in [7], for example, features derived from sentiment analysis of comments are used to predict the popularity of stories. The work presented in [8] uses New York Times articles to examine the relationship between emotions evoked by the content and virality, using semi-automated sentiment analysis to quantify the affectivity and emotionality of each article.
Results suggest a strong relationship between affect and virality; still, the virality metric considered is interesting but very limited: it only consists of how many people emailed the article. The relevant work in [9] measures a different form of content spreading by analyzing which features of a movie quote make it memorable online. Another approach to content virality, somehow complementary to the previous one, is presented in [10], trying to understand which modification dynamics make a meme spread from one person to another (while movie quotes spread remaining exactly the same). More recently, some works tried to investigate how different textual contents give rise to different reactions in the audience: the work presented in [11] correlates several viral phenomena with the wording of a post, while [12] shows that specific content feature variations (like the readability level of an abstract) differentiate among virality levels of downloads, bookmarking, and citations. Still, to our knowledge, no attempt has been made yet to investigate the relation between visual content characteristics and virality.

III. DATA DESCRIPTION
Using the Google+ API (https://developers.google.com/+/api/), we harvested the public posts from the 979 top followed users in
Google+ (plus.google.com), as reported by the socialstatistics.com website on March 2nd 2012 (footnote 2). The time span for the harvesting is one year, from June 28th 2011 (Google+ date of launch) to June 29th 2012. We decided to focus on the most popular users for several reasons: (i) the dataset is uniform from the point of view of sample role, i.e. VIPs, (ii) the behavior of the followers is consistent, e.g. no friendship dynamics, and (iii) extraneous effects due to the followers' network are minimized, since the top followed users' network is vast enough to grant that, if a content is viral, a certain amount of reactions will be obtained. We defined 3 subsets of our dataset, comprising respectively: (i) posts containing a static image, (ii) posts containing an animated image (usually, gif), (iii) posts without attachments (text-only). All other posts (containing as attachment videos, photo albums, links to external sources) were discarded. Statistics for our dataset are reported in Table I. For each post, we considered three virality metrics (footnote 3):
Plusoners: the number of people who +1'd;
Replies: the number of comments;
Resharers: the number of people who reshared.

TABLE I. AN OVERVIEW OF THE GOOGLE+ DATASET.

Global
  actors                                 979
  posts                                  289434
  published interval                     6/28/11 - 6/29/12
Posts with static images
  actors                                 950
  posts                                  173860
  min/max/median posts per actor         1/3685/65.5
  min/max/median plusoners per post      0/9703/33.0
  min/max/median replies per post        0/571/12.0
  min/max/median resharers per post      0/6564/4.0
Posts with animated images
  actors                                 344
  posts                                  12577
  min/max/median posts per actor         1/2262/3.0
  min/max/median plusoners per post      0/5145/17.0
  min/max/median replies per post        0/500/7.0
  min/max/median resharers per post      0/6778/1
Posts without attachments, text-only
  actors                                 939
  posts                                  102997
  min/max/median posts per actor         1/1744/4
  min/max/median plusoners per post      0/20299/16.0
  min/max/median replies per post        0/538/17.0
  min/max/median resharers per post      0/13566/
Note: the replies count is cut around 500 by the API service.
In Figures 1 and 2 we display the evolution over time of the network underlying our dataset (using a week as temporal unit), and of the reactions to posts given by users, respectively. We notice that:
1) The average number of reactions per user shows quite different trends depending on the metric considered: while replies tend not to be affected by the growth of the network, reshares and, to a lesser degree, plusones show an ever-growing trend.
2) The temporal plot of the average number of followers per user (Figure 1) in our dataset (in Google+ terminology, the number of people who circled them) shows a gradient increase around weeks 28/29. Interestingly, this is reflected in the plot of reactions over time (Figure 2): the gradient increases around the same weeks, for reshares and plusones; these effects are most probably due to Google+ transitioning from beta to public in late September 2011 (a similar phenomenon is reported also in [13]).
3) Finally, the orders of magnitude of such growths are very different: we notice that while reactions increase by a factor of 7 over the time period we took into account, the total number of followers increased by a factor of 25.

2 The dataset presented and used in this work will be made available to the community for research purposes.
3 Since the API provides only an aggregate number, we cannot make any temporal analysis of how reactions to a post were accumulated over time.

Fig. 1. Average number of followers per user, at 1-week temporal granularity.
Fig. 2. Average number of reactions per user (plusoners, resharers, replies), at 1-week temporal granularity. This value represents the average number of reactions elicited by each user's posts over 1-week time-slices.

The relative amount of followers' reactions does not significantly increase as the network grows (footnote 4). As detailed in the next section, our analyses are based on comparing probability distributions: e.g.
we evaluate if grayscale images have a significantly higher or smaller probability of reaching a certain virality score than colored ones.

4 It has been noted (see, for instance, http://on.wsj.com/zjrr06) that, especially in the time frame we consider, users' activity did not increase much in front of the exploding network size.

In the following analyses, for the sake of clarity, our discussion will not take into account the normalization factor (i.e. the size of the audience when a content is posted). Indeed, we have run the same analyses normalizing the virality indexes of a given post against its potential audience: i) the effects are still visible, ii) the effects are consistent both in significance and sign with the not-normalized distributions, but iii) differences have lower magnitude (explained by the fact that virality indexes should be normalized using the actual audience, e.g. the
followers exposed to the content). Thus, since we are interested in comparing the virality of different image categories and our preliminary experiments showed that by normalizing the indexes their comparisons, their sign, and the derived interpretations still hold, we choose to report the non-normalized version of the results, which is more intuitively readable. In the following sections, after the analyses of text-only posts and of posts containing an animated image, we will consider the subset of static images as the reference dataset. Exemplar pictures taken from the dataset are shown in Figure 3, depicting some image categories that we will take into account in the following sections.

Fig. 3. Exemplar pictures from the dataset.

IV. DATA ANALYSIS
Virality metrics in our dataset follow a power-law-like distribution thickening toward low virality scores. In order to evaluate the virality power of the features taken into account, we compare the virality indexes in terms of empirical Complementary Cumulative Distribution Functions (CCDFs). These functions are commonly used to analyse online social networks in terms of growth in size and activity (see for example [14], [15], or the discussion presented in [16]) and also for measuring content diffusion, e.g. the number of retweets of a given content [17]. Basically, these functions account for the probability p that a virality index will be greater than n and are defined as follows:

$$\hat{F}(n) = \frac{\text{number of posts with virality index} > n}{\text{total number of posts}} \qquad (1)$$

For example, the probability of having a post with more than 75 plusoners is indicated with $\hat{F}_{plus}(75) = P(\#\text{plusoners} > 75)$. In the following sections we use CCDFs to understand the relation between image characteristics and post virality; in order to assess whether the CCDFs of the several types of posts we take into account show significant differences, we will use the Kolmogorov-Smirnov (K-S) goodness-of-fit test, which specifically targets cumulative distribution functions.

A. Image vs. text-only
First of all, we aim to understand what is the impact of adding an image to a post in Google+.
Some studies [18] already show that posts containing an image are much more viral than simple plain-text posts, and that various characteristics of image-based banners affect viewers' recall and clicks [19]. This finding can be explained in light of a rapid cognition model [20], [21]. In this model, the user has to decide in a limited amount of time, and within a vast information flow of posts, whether to take an action on a particular post (e.g. to reply, reshare, give it a plusone). Thus, pictures, and the characteristics thereof analyzed in the following sections, might play a role of paramount importance in her decision-making process as she exploits visual cues that grab her attention. In some respects, the rapid cognition model is reminiscent of the mechanisms by which humans routinely make judgments about strangers' personality and behavior from very short behavioral sequences and non-verbal cues [22], [23].

In order to investigate the general impact of images we compared posts containing a picture with posts containing only text. While our findings overall coincide with [18], some interesting phenomena emerged. First, we see that the probability for a post with an image to have a high number of resharers is almost three times greater ($\hat{F}_{resh}(10)$ = 8 vs. 0.10, K-S test p < 01), see Figure 4.c. Still, the CCDFs for the other virality indexes show different trends:

Posts containing images have a lower probability of being viral when it comes to the number of comments ($\hat{F}_{repl}(50)$ = 0.33 vs. 2, K-S test p < 01), see Figure 4.b. This can be explained by the fact that text-only posts elicit more linguistic elaboration than images (we also expect that the average length of comments is higher for text-only posts but we do not investigate this issue here).

Also, if we focus on simple appreciation (plusoners in Figure 4.a), results are very intriguing: while up to about 75 plusoners the probability of having posts containing images is higher, after this threshold the situation capsizes.
This finding can be of support to the hypothesis that, while it is easier to impress with images in the information flow, as argued with the aforementioned rapid cognition model, high quality textual content can impress more.

B. Static vs. Animated
Animated images add a further dimension to pictures' expressivity. Having been around since the beginning of the Internet (the gif format was introduced in the late 80's), animated images have had alternate fortune, especially after the wide spread of services like YouTube and the availability of broadband. Nonetheless, they are still extensively used to produce simple animations and short clips. Noticeably, the value of simple and short animations has been acknowledged by Twitter with the recently released Vine service.
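The CCDF of Equation (1) and the K-S comparison used throughout Section IV can be sketched in a few lines. This is a minimal illustration, not the code used for the study: the function names `ccdf` and `ks_statistic` and the toy resharer counts are assumptions made for the example, and only the two-sample K-S statistic is computed (a significance level would normally come from a statistics library, for instance SciPy's `ks_2samp`).

```python
from bisect import bisect_right


def ccdf(values, n):
    """Empirical CCDF: fraction of posts whose virality index is > n."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one post")
    return (len(ordered) - bisect_right(ordered, n)) / len(ordered)


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every value seen in either sample.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical resharer counts for two groups of posts (toy data).
with_image = [0, 2, 11, 15, 40, 3, 12, 0]
text_only = [0, 1, 0, 4, 12, 2, 0, 1]

print(ccdf(with_image, 10))  # share of image posts with more than 10 resharers
print(ccdf(text_only, 10))   # same quantity for text-only posts
print(ks_statistic(with_image, text_only))
```

Comparing the two `ccdf` values at a fixed n mirrors the way the probabilities are quoted in the text, while the K-S statistic summarizes the difference between the whole distributions.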
Fig. 4. Virality CCDFs for posts with image vs. text-only posts (legend: with image, without attachments).
Fig. 5. Virality CCDFs for static vs. animated images (legend: static image, animated image).

Whether a post contains a static or an animated image has a strong discriminative impact on all virality indexes, see Figure 5. With respect to plusoners and replies, static images tend to show higher CCDFs (respectively two and three times more, $\hat{F}_{plus}(75)$ = 0.30 vs. 0.17, $\hat{F}_{repl}(50)$ = 2 vs. 8, K-S test p < 01), while on resharers the opposite holds. The fact that $\hat{F}_{resh}(n)$ is two times higher for posts containing animated images ($\hat{F}_{resh}(10)$ = 8 vs. 7, K-S test p <.001) can be potentially explained by the fact that animated images are usually built to convey a small memetic clip, i.e. funny, cute or quirky situations, as suggested in [24]. In order to verify this hypothesis we have annotated a small random subsample of 200 images. 81% of these animated images were found to be memetic (two annotators were used, positive example if the image scored 1 at least on one of the aforementioned dimensions; annotator agreement is very high, Cohen's kappa 0.78). These findings indicate that animated images are mainly a vehicle for amusement, at least on Google+.

Fig. 6. Virality CCDFs for image orientation (legend: vertical, square, horizontal).

C. Image Orientation
We then focused on the question whether image orientation (landscape, portrait and squared) has any impact on virality indexes. We included squared images in our analysis since they are typical of popular services à la Instagram. These services enable users to apply digital filters to the pictures they take and confine photos to a squared shape, similar to Kodak Instamatic and Polaroids, providing a so-called "vintage effect". We have annotated a small random subsample of 200 images. 55% of these images were found to be "Instagrammed" (two annotators were used, positive example if the image is clearly recognized as modified with a filter; annotator agreement is high, Cohen's kappa 8).
Note that, if we include also black and white squared pictures without any other particular filter applied (b/w is one of the basic filters provided by Instagram) the amount of Instagrammed pictures rises to 65%. Obviously, the ratio of pictures modified with this and similar services could be higher; here, we rather wanted to identify those pictures that were clearly recognized as seeking the aforementioned vintage effect. While the orientation seems not to have a strong impact on resharers, with a mild prevalence of horizontal pictures (see Figure 6.c), plusoners and replies tend to discriminate well among the various image orientations. In particular, portrait images show a higher probability of being viral than squared images, which in turn show a higher probability than landscapes (see Figures 6.a and 6.b). Furthermore, CCDFs indicate that vertical images tend to be more viral than horizontal ones ($\hat{F}_{plus}(75)$ = 0.38 vs. 6, $\hat{F}_{repl}(50)$ = 0.38 vs. 0.17, K-S test p < 01). Hence, while squared images place themselves in the middle in any metric, landscape images have a lower viral probability for plusones and replies but a slightly higher probability for reshares. This can be partially explained by the fact that we are analyzing celebrities' posts. If the vertically-oriented image contains the portrait of a celebrity, this is more likely to be appreciated rather than reshared, since the act of resharing can also be seen as a form of self-representation of the follower (we will analyze the impact of pictures containing faces in the following section). The opposite holds for landscapes, i.e. they are more likely to be reshared and used for self-representation.

D. Images containing one face
In traditional mono-directional media (e.g. tv, billboards, etc.) a widely used promotion strategy is the use of testimonials, especially celebrities endorsing a product. Is the same strategy applicable to Social Media? Understanding the effect of posting images with faces by the most popular Google+ users (and hypothesizing that those are their faces) is a first step in the direction of finding an answer. We computed how many faces are found in the images, along with the ratio between the area that includes faces and the whole image area, using the Viola-Jones [25] face detection algorithm. We considered images containing one face vs. images containing no faces. We did not consider the surface of the image occupied by the face (i.e. if it is a close-up portrait, or just a small face within a bigger picture).
The discriminative effect of containing a face on virality is statistically significant but small. Still, the pictures containing faces tend to have a mild effect on resharers (slightly higher replies and plusoners but lower resharers as compared to images with no faces). In order to verify the hypothesis mentioned earlier, i.e. that self-portraits tend to be reshared less, we also focused on a subsample of images containing faces that cover at least 10% of the image surface (about 6400 instances). In this case, the differences among indexes polarize a little more (higher plusoners and comments, lower resharers), as we were expecting. Unfortunately, images with an even higher face/surface ratio are too few to further verify the hypotheses.

E. Grayscale vs. Colored
The impact and meaning of black-and-white (i.e. grayscale) photographic images has been studied from different perspectives (e.g. semiotics and psychology) and with reference to different fields (from documentary to arts and advertising). Rudolf Arnheim, for example, argues that color produces an essentially emotional experience, whereas shape corresponds to intellectual pleasure [26]. Hence, black-and-white photography, because of its absence of expressive colors, focuses on shapes that require intellectual reflection and leads to exploring aesthetic possibilities. We want to understand if such functions and effects can be spotted in our virality indexes. In order to have a perceptual grayscale (some images may contain highly desaturated colors and so be perceived as shades of gray) we dichotomized the dataset according to the mean-saturation index of the images, using a very conservative threshold of 5 (on a 0-1 scale). As can be seen in Figures 8.a and 8.b, colored images (with saturation higher than 5) have a higher probability of collecting more plusoners and replies as compared to images with lower saturation (grayscale). In particular, the probability function for replies is more than two times higher ($\hat{F}_{repl}(50)$ values are 6 vs. 0.10, K-S test p < 01). Instead, image saturation has no relevant impact on resharers.

Fig. 7. Virality CCDFs for image Brightness (legend: mean brightness at or below the threshold, mean brightness above the threshold).

F. Very Bright Images
After converting each image in our dataset to the HSB color space, we extracted its mean Saturation and Brightness. In more detail, the HSB (Hue/Saturation/Brightness) color space describes each pixel in an image as a point on a cylinder: the Hue dimension represents its color within the set of primary-secondary ones, while Saturation and Brightness describe respectively how close it is to the pure color (i.e. its Hue), and how bright it is. We split the dataset according to the images' mean brightness using a threshold of 5 (on a scale between 0 and 1). Usually images with such a high mean brightness tend to be cartoon-like images rather than pictures.
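The mean Saturation and Brightness described above can be illustrated with a short sketch. It is only an approximation of the procedure, under stated assumptions: pixels are given as plain (r, g, b) tuples with 0-255 channels (decoding an image file would need an imaging library and is not shown), the standard-library HSV conversion is taken as equivalent to HSB, and the function names and threshold arguments are hypothetical.

```python
import colorsys


def mean_saturation_brightness(pixels):
    """Return (mean saturation, mean brightness) on a 0-1 scale.

    `pixels` is an iterable of (r, g, b) tuples with 0-255 channels.
    """
    total_s = 0.0
    total_v = 0.0
    count = 0
    for r, g, b in pixels:
        # colorsys works on 0-1 floats; HSV "value" plays the role of Brightness.
        _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        total_s += s
        total_v += v
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return total_s / count, total_v / count


def split_by_threshold(images, index, threshold):
    """Dichotomize images on mean saturation (index=0) or brightness (index=1).

    `threshold` is supplied by the caller; no particular value is assumed here.
    """
    low, high = [], []
    for pixels in images:
        value = mean_saturation_brightness(pixels)[index]
        (high if value > threshold else low).append(pixels)
    return low, high


# Toy "images": one pure white/black mix and one fully saturated red.
gray_image = [(255, 255, 255), (0, 0, 0)]
red_image = [(255, 0, 0), (255, 0, 0)]
print(mean_saturation_brightness(gray_image))
print(mean_saturation_brightness(red_image))
```

The same helper serves both the grayscale/colored split (on mean saturation) and the brightness split, since the two differ only in which mean is compared against the threshold.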
Fig. 8. Virality CCDFs for Grayscale vs. Colored images (legend: mean saturation at or below the threshold, mean saturation above the threshold).

Previous research [27] has shown that pixel brightness is expected to be higher in cartoon-like (or significantly "photoshopped") images than in natural images. Image brightness level has a strong impact on plusoners and replies, and a milder one on resharers. Brighter images have a lower probability of being viral on the first two indexes (Figures 7.a and 7.b) and a higher probability on the latter (Figure 7.c). In particular, lower brightness images have a plusone and reply probability almost two times higher ($\hat{F}_{plus}(75)$ = 0.31 vs. 0.18, $\hat{F}_{repl}(50)$ = 3 vs. 0.12, K-S test p < 01), while for resharers it is 27% higher in favor of high brightness images ($\hat{F}_{resh}(10)$ = 0.33 vs. 6). Surprisingly, analyzing a small random subsample of 200 very bright images, we found that while 88% of these images contained some text, as we would have expected, only 13% were cartoons/comics and only 13% contained the real picture of an object as subject, even if highly photoshopped. Above all, only a small amount of these images (21%) was considered funny or memetic (footnote 5). The great majority comprised pictures containing infographics, screenshots of software programs, screenshots of social-network posts and similar. In this respect we are analyzing content that is meant to be mainly informative, and is somehow complementary to the content of animated pictures (mainly intended for amusement, see IV-B).

5 Two annotators were used, four binary categories were provided (contain-text/comics/real-picture-obj/funny). The overall inter-annotator agreement on these categories is high, Cohen's kappa 0.74.

G. Vertical and Horizontal edges
Finally, we want to report on an explorative investigation we made. We focused on the impact of edge intensity on post virality. The intensity of vertical/horizontal/diagonal edges was computed using Gaussian filters, based on code used in [28] in the context of real-time visual concept classification. The probability density of the average edge intensity follows a Gaussian-like distribution, with a mean of about 8 (both for horizontal and vertical edges).
We divided images into two groups: those having an average edge intensity below the sample mean, and those having an average edge intensity above the mean. Results showed that images with horizontal edge intensity below the sample mean are far more viral on the plusoners and replies indexes, while vertical edges are less discriminative. Results for horizontal edges are as follows: $\hat{F}_{plus}(75)$ = 0.36 vs. 2, $\hat{F}_{repl}(50)$ = 7 vs. 0.14, $\hat{F}_{resh}(10)$ = 5 vs. 9, K-S test p < 01. While these results do not have an intuitive explanation, they clearly show that there is room for further investigating the impact of edges.

H. Virality Indexes Correlation
From the analyses above, virality indexes seem to move together (in particular plusoners and replies) while resharers appear to indicate a different phenomenon. We hypothesize that plusoners and replies can be considered as a form of endorsement, while reshares are a form of self-representation. This explains why, for example, pictures containing faces are endorsed but not used for self-representation by VIPs' followers. On the contrary, animated images, which usually contain funny material, are more likely to provoke reshares for followers' self-representation. In fact, people usually tend to represent themselves with positive feelings rather than negative ones (especially popular users, see [29]), and positive moods appear to be associated with social interactions [30], [31].

TABLE II. VIRALITY INDEXES CORRELATION ON THE DATASETS

                             Pearson   MIC
Static images
  plusoners vs. replies      0.723     33
  plusoners vs. resharers    0.550     17
  replies vs. resharers      20        0.126
Animated images
  plusoners vs. replies      0.702     0.304
  plusoners vs. resharers    0.787     0.396
  replies vs. resharers      0.554     05
Text only
  plusoners vs. replies      02        0.529
  plusoners vs. resharers    85        73
  replies vs. resharers      0.172     0.185

This is supported also by the correlation analysis of the three virality indexes, reported in Table II, made on the various datasets we exploited. In this analysis we used both the Pearson coefficient and the recent Maximal Information Coefficient (MIC), considering plusoners 1200, replies 400 and resharers 400.
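The Pearson part of such a correlation analysis can be sketched as follows. This is an illustrative example only: the post tuples are toy data, the helper names are hypothetical, and treating 1200/400/400 as upper caps on plusoners, replies and resharers is an assumption, since the direction of the cut-off is not stated here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def within_caps(posts, caps=(1200, 400, 400)):
    """Keep posts whose (plusoners, replies, resharers) do not exceed the caps.

    The caps are interpreted as upper bounds; this is an assumption.
    """
    return [p for p in posts if all(v <= c for v, c in zip(p, caps))]


# Hypothetical posts as (plusoners, replies, resharers) tuples.
posts = [(10, 2, 1), (50, 9, 3), (200, 30, 40), (5000, 450, 900), (90, 20, 2)]
kept = within_caps(posts)
plus = [p[0] for p in kept]
repl = [p[1] for p in kept]
resh = [p[2] for p in kept]
print(pearson(plus, repl))
print(pearson(plus, resh))
print(pearson(repl, resh))
```

Pearson captures only linear association, which is one reason to pair it with a dependence measure that is not restricted to linear relationships.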
MIC is a measure of dependence introduced in [32] and it is part of the Maximal Information-based Nonparametric Exploration (MINE) family of statistics. MIC is able to capture variable relationships of different nature, penalizing similar levels of noise in the same way. In this study we use the Python package minepy [33]. In particular, from Table II we see that plusones and replies always have a high correlation while replies and resharers always correlate low. Plusoners and reshares, which have a mild correlation in most cases, correlate highly when it comes to funny pictures, i.e. animated ones. This can be explained by a specific procedural effect: the follower expresses his/her
appreciation for the funny picture and, after that, he/she reshares the content. Since resharing implies also writing a comment in the new post, the reply is likely not to be added to the original VIP's post. In Table III we sum up the main findings of the paper, comparing the various CCDFs: animated images and infographics have a much higher probability of being reshared, while colored images or images containing faces have a higher probability of being appreciated or commented. Finally, black and white pictures (grayscale) turn out to be the least viral on Google+.

TABLE III. SUMMARY OF MAIN FINDINGS OF THE ANALYSIS.

                       F_plus(75)   F_repl(50)   F_resh(10)
very bright            0.18         0.12         0.33
grayscale              1            0.11         8
color                  0.31         4            7
animated               0.17         8            8
one-face > 10% area    0.35         0.30         3

V. USER ANALYSIS
Finally, we investigate if there is any relevant interaction between image characteristics and VIPs' typology. In Table IV we report demographic details (footnote 6) on the Google+ dataset, as provided by the users in their profile pages.

TABLE IV. USER DEMOGRAPHICS IN THE GOOGLE+ DATASET.

User category   Female (%)  Male (%)    Neutral (%)  Total (%)
Technology      35 (19%)    110 (61%)   36 (20%)     181 (19%)
Photography     41 (24%)    130 (76%)   1 (1%)       172 (18%)
Music           96 (59%)    48 (29%)    19 (12%)     163 (17%)
Writing         26 (21%)    76 (63%)    19 (16%)     121 (13%)
Actor           21 (36%)    34 (59%)    3 (5%)       58 (6%)
Entrepreneur    12 (29%)    29 (71%)    -            41 (4%)
Sport           -           22 (55%)    18 (45%)     40 (4%)
Artist          11 (31%)    21 (60%)    3 (9%)       35 (4%)
TV              8 (24%)     11 (33%)    14 (42%)     33 (3%)
Company         -           -           28 (100%)    28 (3%)
Website         -           -           23 (100%)    23 (2%)
Politician      -           19 (86%)    3 (14%)      22 (2%)
No Category     6 (43%)     8 (57%)     -            14 (1%)
Organization    -           -           9 (100%)     9 (1%)
Not Available   -           -           7 (100%)     7 (1%)
Other           1 (33%)     2 (67%)     -            3 (0%)
Total           257 (27%)   510 (54%)   183 (19%)    950 (100%)

In order to investigate possible user category effects in our dataset, that is, if our analyses are also influenced by the type of user posting images rather than by the actual content solely, we evaluated the entropy for each image category over the 16 user categories (as defined in Table IV).
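One way to quantify how the spread of an image category over user categories departs from a reference distribution is sketched below, using Shannon entropy and the Kullback-Leibler divergence. The counts are hypothetical and the use of base-2 logarithms is an assumption; the function names are illustrative only.

```python
from math import log2


def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def entropy(p):
    """Shannon entropy (in bits) of a distribution."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits; q must be non-zero wherever p is."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with p = 0 contribute nothing
        if qi == 0:
            raise ValueError("q is zero where p is not; divergence is infinite")
        total += pi * log2(pi / qi)
    return total


# Hypothetical image counts over four user categories.
reference = normalize([220, 200, 140, 440])   # all images
category = normalize([40, 9, 3, 48])          # one image category

print(entropy(category))
print(kl_divergence(category, reference))
```

A divergence close to zero means the image category is posted by the various user categories in roughly the same proportions as images overall; larger values point to categories of users that post it disproportionately often or rarely.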
In Table V we report the contingency table of image-category entropy distributions over user-categories. Looking at the Kullback-Leibler (KL) divergence of specific image categories with respect to the reference distribution (i.e., taken as the total number of images posted by each user-category), we observe very few but interesting effects due to specific user-categories. In particular, while all the KL divergences are very small, two of them (for Grayscale and High Brightness, reported in bold) are an order of magnitude greater than those of the other classes. Interestingly, the divergence is explained mainly by the distribution gap in only a few user categories. For High Brightness the gap is mainly given by the Technology user category, which doubles its probability distribution (from 22% to 40%), and by Music and Photography, which reduce their probability distribution to one third. This divergence from the reference distribution is consistent with the analysis of the content we made in Section IV-F: these images were mainly infographics and screenshots of software programs and social networks (so mainly connected to technology). For Grayscale the gap is mainly given by the Photography user category, which raises its probability distribution by 50%, and by Music, which reduces it to one third. This gap is consistent with the idea, expressed in Section IV-E, that black-and-white photography is a particular form of art expressivity mainly used by professionals.

6 "No Category" denotes users that do not provide any personal information and for which it was not possible to trace back their category; "Not Available" denotes seven accounts that were no longer publicly accessible when we gathered demographic info; "Other" denotes very rare and unusual category definitions. The "Neutral" gender refers to pages afferent to non-humans like products, brands, websites, firms, etc.

VI. CONCLUSIONS
We have presented a study, based on a novel dataset of Google+ posts, showing that perceptual characteristics of an image can strongly affect the virality of the post embedding it. Considering various kinds of images (e.g. cartoons, panoramas or self-portraits) and related features (e.g.
orientation, animations) we saw that users' reactions are affected in different ways. We provided a series of analyses to explain the underlying phenomena, using three virality metrics (namely plusoners, replies and resharers). Results suggest that plusoners and replies move together while reshares indicate a distinct users' reaction. In particular, funny and informative images have a much higher probability of being reshared but are associated with different image features (animation and high brightness respectively), while colored images or images containing faces have a higher probability of being appreciated and commented. Future work will dig deeper into the assessment of relations between visual content and virality indexes, adopting multivariate analysis that includes users' categories (e.g. which is the viral effect of b/w pictures taken by a professional photographer as compared to those taken by non-professional users). We will also extend our experimental setup in the following ways: (a) taking into account compositional features of the images, i.e. resembling concepts such as the well-known "rule of thirds"; (b) extracting and exploiting descriptors such as color histograms and oriented-edges histograms; (c) building upon the vast literature available in the context of scene/object recognition, dividing our dataset into specific categories in order to analyse relations between categories, such as natural images or sport images, and their virality.

ACKNOWLEDGMENT
The work of J. Staiano and M. Guerini has been partially supported by the FIRB project S-PATTERNS and the Trento RISE PerTe project, respectively.

REFERENCES
[1] K. Lerman and R. Ghosh, "Information contagion: an empirical study of the spread of news on digg and twitter social networks," in Proceedings of ICWSM-10, Mar 2010.
[2] S. Jamali and H. Rangwala, "Digging digg: Comment mining, popularity prediction, and social network analysis," in Proceedings of International Conference on Web Information Systems and Mining, 2009.
TABLE V. CONTINGENCY TABLE OF IMAGE-CATEGORY DISTRIBUTIONS OVER USER-CATEGORIES.

User category   Grayscale  Colored  High Brightness  Low Brightness  Containing Face  Containing No Face  Squared  Vertical  Horizontal  Total
No Category     7%         6%       9%               6%              5%               7%                  4%       5%        7%          6%
Actor           4%         6%       5%               5%              8%               5%                  5%       6%        5%          5%
Artist          5%         6%       7%               6%              6%               6%                  5%       7%        6%          6%
Company         0%         1%       1%               1%              1%               1%                  1%       1%        1%          1%
Entrepreneur    8%         7%       6%               7%              7%               7%                  8%       5%        8%          7%
Music           3%         16%      3%               16%             19%              12%                 15%      29%       8%          14%
Not Available   0%         0%       0%               0%              0%               0%                  0%       0%        0%          0%
Organization    0%         0%       0%               0%              0%               0%                  0%       0%        0%          0%
Other           0%         0%       0%               0%              0%               0%                  2%       0%        0%          0%
Photography     31%        19%      9%               22%             15%              23%                 23%      14%       23%         20%
Politician      0%         0%       0%               0%              0%               0%                  0%       0%        0%          0%
Sport           0%         3%       1%               3%              4%               2%                  2%       2%        3%          2%
Technology      27%        22%      40%              20%             19%              24%                 16%      18%       25%         22%
TV              1%         2%       1%               2%              3%               1%                  5%       1%        2%          2%
Website         1%         2%       2%               2%              2%               2%                  1%       1%        2%          2%
Writing         11%        10%      17%              10%             11%              10%                 11%      10%       11%         11%
KL-divergence   0.173      02       59               03              27               06                  47       76        29

[3] E. Khabiri, C.-F. Hsu, and J. Caverlee, "Analyzing and predicting community preference of socially generated metadata: A case study on comments in the digg community," in Proceedings of ICWSM-09, 2009.
[4] J. Z. Aaditeshwar Seth and R. Cohen, "A multi-disciplinary approach for recommending weblog messages," in The AAAI 2008 Workshop on Enhanced Messaging, 2008.
[5] K. Lerman and A. Galstyan, "Analysis of social voting patterns on digg," in Proceedings of the first workshop on Online social networks, ser. WOSP '08. New York, NY, USA: ACM, 2008, pp. 7-12. [Online]. Available: http://doi.acm.org/10.1145/1397735.1397738
[6] S. Aral and D. Walker, "Creating social contagion through viral product design: A randomized trial of peer influence in networks," Management Science, vol. 57, no. 9, pp. 1623-1639, 2011.
[7] S. Jamali, "Comment mining, popularity prediction, and social network analysis," Master's thesis, George Mason University, Fairfax, VA, 2009.
[8] J. A. Berger and K. L. Milkman, "Social Transmission, Emotion, and the Virality of Online Content," Social Science Research Network Working Paper Series, December 2009.
[9] C. Danescu-Niculescu-Mizil, J. Cheng, J. Kleinberg, and L. Lee, "You had me at hello: How phrasing affects memorability," in Proceedings of the ACL, 2012.
[10] M. Simmons, L. A. Adamic, and E. Adar, "Memes online: Extracted, subtracted, injected, and recollected," Proceedings of ICWSM-11, 2011.
[11] M. Guerini, C. Strapparava, and G. Özbal, "Exploring text virality in social networks," in Proceedings of ICWSM-11, Barcelona, Spain, July 2011.
[12] M. Guerini, A. Pepe, and B. Lepri, "Do linguistic style and readability of scientific abstracts affect their virality," Proceedings of ICWSM-12, 2012.
[13] D. Schiöberg, S. Schmid, F. Schneider, S. Uhlig, H. Schiöberg, and A. Feldmann, "Tracing the birth of an osn: social graph and profile analysis in google+," in Proceedings of the 3rd Annual ACM Web Science Conference, ser. WebSci '12. New York, NY, USA: ACM, 2012, pp. 265-274. [Online]. Available: http://doi.acm.org/10.1145/2380718.2380753
[14] Y.-Y. Ahn, S. Han, H. Kwak, S. Moon, and H. Jeong, "Analysis of topological characteristics of huge online social networking services," in Proceedings of the 16th international conference on World Wide Web. ACM, 2007, pp. 835-844.
[15] J. Jiang, C. Wilson, X. Wang, P. Huang, W. Sha, Y. Dai, and B. Y. Zhao, "Understanding latent interactions in online social networks," in Proceedings of the 10th ACM SIGCOMM conference on Internet measurement. ACM, 2010, pp. 369-382.
[16] J. Leskovec, Dynamics of large networks. ProQuest, 2008.
[17] H. Kwak, C. Lee, H. Park, and S. Moon, "What is twitter, a social network or a news media?" in Proceedings of the 19th international conference on World wide web. ACM, 2010, pp. 591-600.
[18] TrackSocial, "Optimizing Facebook Engagement," whitepaper, 2012.
[19] H. Li and J. L. Bukovac, "Cognitive impact of banner ad characteristics: An experimental study," Journalism & Mass Communication Quarterly, vol. 76, no. 2, pp. 341-353, 1999.
[20] N. Ambady and R. Rosenthal, "Thin slices of expressive behavior as predictors of interpersonal consequences: A meta-analysis," Psychological Bulletin, vol. 111, no. 2, p. 256, 1992.
[21] D. Kenny, Interpersonal perception: A social relations analysis. The Guilford Press, 1994.
[22] B. Lepri, R. Subramanian, K. Kalimeri, J. Staiano, F. Pianesi, and N.
See, Employing soil gze nd speking tivity for utomti determintion of the extrversion trit, in Interntionl Conferene on Multimodl Interfes nd the Workshop on Mhine Lerning for Multimodl Intertion, ser. ICMI-MLMI 10. New York, NY, USA: ACM, 2010, pp. 7:1 7:8. [Online]. Aville: http://doi.m.org/10.1145/1891903.1891913 [23] J. R. Curhn nd A. Pentlnd, Thin slies of negotition: prediting outomes from onverstionl dynmis within the first 5 minutes. Journl of Applied Psyhology, vol. 92, no. 3, p. 802, 2007. [24] C. Dufour, An investigtion into the use of virl mrketing for the ompnies nd the key suess ftors of good virl mpign, Ph.D. disserttion, Dulin Business Shool, 2011. [25] P. Viol nd M. J. Jones, Roust rel-time fe detetion, Int. J. Comput. Vision, vol. 57, no. 2, pp. 137 154, My 2004. [Online]. Aville: http://dx.doi.org/10.1023/b:visi.0000013087.49260.f [26] R. Arnheim, Art nd visul pereption. Stokholms Universitet, 1987. [27] T. I. Inev, A. P. de Vries, nd H. Rohrig, Deteting rtoons: se study in utomti video-genre lssifition, in Proeedings of the 2003 Interntionl Conferene on Multimedi nd Expo - Volume 2, ser. ICME 03. Wshington, DC, USA: IEEE Computer Soiety, 2003, pp. 449 452. [Online]. Aville: http: //dl.m.org/ittion.fm?id=1170745.1171531 [28] J. Uijlings, A. Smeulders, nd R. Sh, Rel-time visul onept lssifition, Multimedi, IEEE Trnstions on, vol. 12, no. 7, pp. 665 681, nov. 2010. [29] D. Queri, J. Ellis, L. Cpr, nd J. Crowroft, In the mood for eing influentil on twitter, Proeedings of IEEE SoilCom 11, 2011. [30] J. R. Vittengl nd C. S. Holt, A time-series diry study of mood nd soil intertion, Motivtion nd Emotion, vol. 22, no. 3, pp. 255 275, 1998. [31] M. De Choudhury, S. Counts, nd M. Gmon, Not ll moods re reted equl! exploring humn emotionl sttes in soil medi, in Proeedings of ICWSM-12, 2012. [32] D. N. Reshef, Y. A. Reshef, H. K. Finune, S. R. Grossmn, G. MVen, P. J. Turnugh, E. S. Lnder, M. Mitzenmher, nd P. C. 
Seti, Deteting novel ssoitions in lrge dt sets, siene, vol. 334, no. 6062, pp. 1518 1524, 2011. [33] D. Alnese, M. Filosi, R. Visintiner, S. Ridonn, G. Jurmn, nd C. Furlnello, minerv nd minepy: engine for the mine suite nd its r, python nd mtl wrppers, Bioinformtis, vol. 29, no. 3, pp. 407 408, 2013.