Overview of the State-of-the-Art in Social Networks. Evolution of social network studies


Overview of the State-of-the-Art in Social Networks
INF5370, spring 2013

Evolution of social network studies: mathematical studies of networks formed by the actual human interactions
- Pandemics, speed of gossiping, advertisements
- 1977: 34 people in a Karate club
  - First scientific study of an actual network [Zachary, J. Anth. Res. 1977]
  - Two years of field work
- 2003: 436 people using a corporate system [Adamic and Adar, Social Networks 2003]
- 2006: 43,553 people using a university system [Kossinets and Watts, Science 2006]
- 2007: 4,400,000 people using an online blogging service [Backstrom et al., KDD 2007]
- 2009: 240,000,000 people using an instant messaging service [Leskovec and Horvitz, WWW 2009]

What are social networks about?
- We all know what it is about
  - Chatting with friends
  - Blogging
  - Following people
  - Discussing information of interest
  - Sharing data among friends
- However
  - A dozen major reasons for people to use social networks
  - Thousands of possible meaningful data filters
  - Different acceptable ethical guidelines and cultural differences
  - Different governmental policies about data exchange
  - Hundreds of different implementations with varying functionality
- Are we going to converge?

Analysis of relations and social graphs
- The meaning of a relation or a link
  - Flickr: bookmark
  - LinkedIn: send messages
  - Facebook: view (some) content or mutual agreement
- Links can be established between
  - Best friends
  - Casual acquaintances
  - People who like to publicly disagree with each other
  - People who have not heard of each other
- Are social links valid indicators of real user interaction?
  - <1% of people on Facebook talk to >50% of their friends
  - 50% of users interact with less than 20% of their friends
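The interaction figures above compare each user's declared friend list with the friends they actually contact. A minimal Python sketch of that measurement follows. The function names and the toy data are hypothetical illustrations, not taken from the studies behind the slide, and "interaction" is assumed to mean any recorded contact with a declared friend.

```python
def interaction_fractions(friends, interactions):
    """For each user with at least one friend, return the fraction of
    declared friends the user has actually interacted with."""
    fractions = {}
    for user, friend_set in friends.items():
        if not friend_set:
            continue  # the fraction is undefined for users without friends
        # Only contacts that are also declared friends count.
        contacted = interactions.get(user, set()) & friend_set
        fractions[user] = len(contacted) / len(friend_set)
    return fractions


def share_of_users_below(fractions, threshold):
    """Share of users whose interaction fraction is strictly below threshold."""
    if not fractions:
        return 0.0
    below = sum(1 for f in fractions.values() if f < threshold)
    return below / len(fractions)


# Toy data (hypothetical): declared friendship links per user.
friends = {
    "ann": {"bob", "cat", "dan", "eve", "fay"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
}
# Toy data (hypothetical): whom each user actually contacted.
interactions = {
    "ann": {"bob"},
    "bob": {"ann", "cat"},
    "cat": {"zed"},  # "zed" is not a declared friend, so it is ignored
}

fractions = interaction_fractions(friends, interactions)
print(fractions)
print(share_of_users_below(fractions, 0.2))
```

On real data, the same two functions would be run over the full friendship graph and an interaction log for a fixed observation window. The choice of window strongly affects the resulting percentages.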

Social networks are navigable
- "An Experimental Study of the Small World Problem" by Jeffrey Travers and Stanley Milgram [Sociometry, vol.32 no ]
  - Performed by tracking snail mail communication
  - 6 degrees of separation: short chains do exist
- Not only do short chains exist, but people can find them
  - Using only local knowledge and information about the target
- What made networks navigable, and how do the users pick the next hop?
- "An Experimental Study of Search in Global Social Networks" by Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts [Science, vol. 301 no August 2003]
  - Geography dominated early in the chain
  - Work and education dominated later in the chain
  - In successful chains, non-close ties were chosen more often

Other characteristics of interest
- Links exhibit a high degree of symmetry
  - Unlike web links
  - Consequently, in-degree is close to out-degree
  - Results in a smaller network diameter
- The distribution of degrees follows a power law
- There exists a core in most studied social networks
  - A minimal connected component whose removal disconnects the graph
  - Was discovered to include 1-10% of all the nodes in a social network
  - Most short paths go through the core
  - Can be used for quickly disseminating information
- Low-degree users show a high degree of clustering
  - Users with few friends tend to form mini-cliques
  - Clusters are connected by bridges
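Two of the characteristics listed above, link symmetry and clustering among low-degree users, can be measured directly on a graph. The Python sketch below computes edge reciprocity (the share of directed links that are returned) and the local clustering coefficient (the share of a node's neighbor pairs that are themselves linked). The function names and the four-node toy graph are hypothetical; these are the standard textbook definitions, not necessarily the exact metrics used in the studies the slides summarize.

```python
def reciprocity(directed_edges):
    """Fraction of directed edges (u, v) for which (v, u) also exists."""
    edges = set(directed_edges)
    if not edges:
        return 0.0
    mutual = sum(1 for (u, v) in edges if (v, u) in edges)
    return mutual / len(edges)


def undirected_neighbors(directed_edges):
    """Build an undirected adjacency map, ignoring edge direction."""
    neighbors = {}
    for u, v in directed_edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def local_clustering(neighbors, node):
    """Fraction of pairs of the node's neighbors that are linked to each other."""
    nbrs = neighbors[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    # Count each unordered neighbor pair once (u < v).
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in neighbors[u])
    return 2 * links / (k * (k - 1))


# Toy graph (hypothetical): a fully mutual triangle a-b-c,
# plus one unreturned link from c to d.
edges = [
    ("a", "b"), ("b", "a"),
    ("a", "c"), ("c", "a"),
    ("b", "c"), ("c", "b"),
    ("c", "d"),
]
nbrs = undirected_neighbors(edges)

print(reciprocity(edges))           # 6 of the 7 links are returned
print(local_clustering(nbrs, "a"))  # a's two neighbors are linked
print(local_clustering(nbrs, "c"))  # only 1 of c's 3 neighbor pairs is linked
```

In the toy graph the degree-2 node `a` sits in a mini-clique and has a higher clustering coefficient than the degree-3 node `c`, which also acts as the bridge to `d`.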

Taxonomy of existing social network applications
- Online social networks (OSN)
  - Twitter, Facebook, LiveJournal, Digg, del.icio.us
  - Collaborative editing (Wikipedia)
  - Integrated discussions (Google Wave)
- Mobile and ad-hoc social networks
  - Virtual Collaborative Networks
  - Delay-Tolerant Networks (DTNs)
- Collaborative streaming applications
  - Integrated with multimedia content delivery (Tribler, Spotify)
- General and integrated solutions

Anatomy of a social network application
- Frontend (functionality)
- Data communication middleware
- Backend (data storage)

Challenges of Implementing Social Networks
- Rich functionality
- Non-functional properties
  - Scalability
  - Degree of privacy
  - Security (resilience to attacks)
  - Data availability
  - Real-time guarantee (e.g., for tweets, RSS, streaming applications)

Rationale for decentralizing social network applications
- Decentralization is inherent for mobile social networks
- Federation of a large number of existing social networks
  - Interoperable social networks, each network following its own rules and regulations
- Privacy-oriented solutions
  - No concentration of financially valuable data at any single location
  - The end users are endowed with better control over their own data

Yet, decentralization poses additional challenges
- Distributed data mining is more challenging
  - Large-scale decentralization diminishes the computational potential of large data centers
- Requires creation of a new infrastructure
  - Even supporting search is not trivial
- Requires a different business model

The focus of this course
- Data analysis
- Challenges and proposed technological solutions
- Particular focus on decentralized solutions