Security and Trust in social media networks
Prof. Touradj Ebrahimi, touradj.ebrahimi@epfl.ch
DMP meeting, San Jose, CA, USA, 11 January 2014
Social media landscape: http://www.fredcavazza.net/2012/02/22/social-media-landscape-2012/
Popularity of social networks: March 2012, May 2012. http://blog.tweetsmarter.com/social-media/spring-2012-social-media-user-statistics/
Popularity of social networks (cont.): 1+ billion photos; 7+ billion photos; ~140 billion photos; 72 hours of video every minute. April 2012.
Popularity of social networks (cont.): February 2012. http://www.trendweek.com/en/60-seconds-of-social-media-sharing/
Popularity of social networks (cont.): January 2012. http://online.wsj.com/article/sb10001424052970204653604577249341403742390.html
Trust & security issues in social networks
Trust & security issues in social networks (cont.): The Socialcam application automatically shares the videos you watch with your Facebook friends. It may be doing you more harm than good!
Trust & security issues in social networks (cont.): Spotify and Yahoo News Activity automatically publish on your profile page the songs you have listened to and the news you have read.
Trust & security issues in social networks (cont.): Responsible sharing (e.g. a friend posted an embarrassing photo of a young political candidate, Emma Kiernan, enjoying a party). Anybody can post any picture of you on Facebook at any time! Even if deleted, it is there forever, stored in the vast Internet memory bank to be found again and again! http://newsbizarre.com/2009/05/emma-kiernan-facebook-photo-puts-young.html
Trust & security issues in social networks (cont.): Responsible sharing (e.g. top model Rosanagh Wypych posted photos of herself and friends drinking alcohol and consuming cannabis). http://www.3news.co.nz/top-model-rosanagh-worried-after-facebook-pot-pic/tabid/418/articleid/224047/default.aspx
Trust & security issues in social networks (cont.): Wrong tags in Flickr. Challenges: identify the most appropriate tags; eliminate noisy or spam tags. Only around 50% of tags are truly related to an image [Kennedy et al., ACM MIR 2006].
Trust & security issues in social networks (cont.): Spam bookmarks in Delicious
Trust & security solutions in social networks
Trust & security solutions in social networks (cont.): Limit the number of people you share content and communications with by creating private groups (e.g. Google+ Circles, Facebook Smart Lists).
Trust & security solutions in social networks (cont.): Limit the time during which shared content can be watched (e.g. the SnapChat application, though it cannot guarantee that your messaging data will be deleted in all instances).
Trust & security solutions in social networks (cont.): Manage your account (friends, photos, comments, posts). Make sure that what you are sharing or posting is not going to cause regret. http://www.huffingtonpost.com/2012/03/20/social-media-privacy-infographic_n_1367223.html
Trust & security before social media: e-mail, web search, web videos, blogs, online shopping systems, peer-to-peer (P2P) networks
Anti-spam strategies in online systems [Heymann, Koutrika, Garcia-Molina, IEEE IC 2007]
Trust and reputation online systems
Trust and reputation online systems (cont.): Email spam filtering
Model of social tagging system [Ivanov, Vajda, Lee, Ebrahimi, IEEE SPM 2012]
Categorization of trust models [Ivanov, Vajda, Lee, Ebrahimi, IEEE SPM 2012]
User versus content trust modeling
User trust modeling is more popular than content trust modeling:
User trust models are less complex than content trust models.
User trust models can adapt quickly to the constantly evolving environment of social systems, owing to the type of features used for modeling, and thus remain applicable longer than content trust models without the need to create new models.
However, user trust modeling has the disadvantage of being a "broad brush": it may be excessively strict if a user happens to post one bit of questionable content among otherwise legitimate content.
Subjectivity in classifying spam and non-spam content or users remains a fundamental issue: what is spam to one person may be interesting to another.
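The "broad brush" effect of user trust modeling can be illustrated with a minimal sketch. Everything here is hypothetical: the post data, the per-post `is_spam` labels (in practice these would come from a classifier or moderators), the trust definition (fraction of non-spam posts) and the threshold of 0.8 are assumptions made for illustration, not a method taken from the cited works.

```python
from collections import defaultdict

# Hypothetical posts: (user, post_id, is_spam). Labels are assumed known.
posts = [
    ("alice", "a1", False),
    ("alice", "a2", False),
    ("alice", "a3", True),   # one questionable post among legitimate ones
    ("bob", "b1", False),
    ("bob", "b2", False),
]


def user_trust(posts):
    """Trust of a user = fraction of that user's posts that are not spam."""
    total = defaultdict(int)
    clean = defaultdict(int)
    for user, _, is_spam in posts:
        total[user] += 1
        if not is_spam:
            clean[user] += 1
    return {user: clean[user] / total[user] for user in total}


def filter_by_user(posts, threshold=0.8):
    """User-level filtering: keep every post of users above the threshold."""
    trust = user_trust(posts)
    return [pid for user, pid, _ in posts if trust[user] >= threshold]


def filter_by_content(posts):
    """Content-level filtering: judge each post on its own."""
    return [pid for _, pid, is_spam in posts if not is_spam]


kept_by_user = filter_by_user(posts)
kept_by_content = filter_by_content(posts)
print(kept_by_user)     # alice's legitimate posts are dropped with her spam
print(kept_by_content)  # only the single questionable post is dropped
```

With these made-up numbers, one questionable post pushes alice below the threshold, so the user-level filter also discards her two legitimate posts, while the content-level filter removes only the questionable one.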
Summary of representative recent techniques
Reference | Trust model | Media | Method | Dataset
Gyongyi et al. 2004 | content | web pages | an iterative approach, called TrustRank, to propagate trust scores to all nodes in the graph by splitting the trust score of a node among its neighbors according to a weighting scheme | AltaVista, real
Koutrika et al. 2008 | content | bookmarks | a coincidence-based model for query-by-tag search which estimates the level of agreement among different users in the system for a given tag | Delicious, real & simulated
Wu et al. 2005 | content | images | a computer vision technique based on low-level image features to detect embedded text and computer-generated graphics | SpamArchive & Ling-Spam, real
Liu et al. 2009 | content & user | bookmarks | an iterative approach to identify spam content by its information value extracted from the collaborative knowledge | Delicious, real
Bogers and Van den Bosch 2008 | content & user | bookmarks | KL-divergence to measure the similarity between language models and new posts | BibSonomy & CiteULike, real
Ivanov et al. 2012 | user | images | an approach based on the feedback from other users who agree or disagree with a tag associated with an image | Panoramio, real
[Ivanov, Vajda, Lee, Ebrahimi, IEEE SPM 2012]
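The trust propagation idea behind TrustRank (trust flows from a node to its neighbors by splitting the node's score among them) can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm in full: the uniform split over outgoing links, the damping factor of 0.85, the fixed 20 iterations, the function name `trust_rank` and the toy graph are all choices made here for the example.

```python
def trust_rank(out_links, seeds, damping=0.85, iterations=20):
    """Propagate trust from hand-picked trusted seeds along outgoing links.

    out_links: dict mapping each node to the list of nodes it links to.
    seeds: set of nodes assumed to be trustworthy (hypothetical input).
    """
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    # Trust is injected only at the seed nodes.
    seed_score = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    trust = dict(seed_score)
    for _ in range(iterations):
        incoming = {n: 0.0 for n in nodes}
        for node, targets in out_links.items():
            if not targets:
                continue
            share = trust[node] / len(targets)  # assumed uniform weighting scheme
            for target in targets:
                incoming[target] += share
        trust = {
            n: damping * incoming[n] + (1.0 - damping) * seed_score[n]
            for n in nodes
        }
    return trust


# Toy web graph: spam pages link to good pages, but no good page links to spam.
graph = {
    "good_a": ["good_b"],
    "good_b": ["good_a", "good_c"],
    "good_c": [],
    "spam_x": ["spam_y", "good_a"],
    "spam_y": ["spam_x"],
}
scores = trust_rank(graph, seeds={"good_a"})
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node}: {scores[node]:.4f}")
```

In this toy graph the spam pages receive no trust at all, because trust only travels along links that start at trusted pages, and no trusted page links to them.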
Summary of representative recent techniques (cont.)
Reference | Trust model | Media | Method | Dataset
Xu et al. 2006 | user | bookmarks | an iterative approach to compute the goodness of each tag with respect to a content and the authority scores of the users | MyWeb 2.0, real
Krestel and Chen 2008 | user | bookmarks | a TrustRank-based approach using features which model tag co-occurrence, content co-occurrence and co-occurrence of tag-content | BibSonomy, real
Benevenuto et al. 2009 | user | videos | a supervised learning approach applied on features that reflect users' behavior through video responses | YouTube, real
Lee et al. 2010 | user | tweets | a machine learning approach applied on social honeypots, including users' profile and tweet features | Twitter, real & simulated
Krause et al. 2008 | user | bookmarks | a machine learning approach applied on features of a user's profile, bookmarking activity and context of tags | BibSonomy, real
Markines et al. 2009 | user | bookmarks | a machine learning approach applied on tag-, content- and user-based features | BibSonomy, real
Caverlee et al. 2008 | user | user profiles | an approach to compute a dynamic trust score, called SocialTrust, depending on the quality of the relationship and personalized feedback ratings | MySpace, real
[Ivanov, Vajda, Lee, Ebrahimi, IEEE SPM 2012]
Open issues & challenges
Publicly available datasets:
Datasets from different social networks, or even different datasets from one social network, are rarely published for evaluating trust modeling approaches, which makes it difficult to compare the results and performance of different trust models.
Most datasets support evaluating only one aspect of trust modeling, either user or content trust modeling; evaluating the other aspect requires introducing simulated objects into the real-world social tagging datasets.
Open issues & challenges (cont.)
Dynamics of trust:
A user's trust tends to vary over time with the user's experience and the evolution of social networks.
Only a few approaches deal with the dynamics of trust by distinguishing between recent and old tags.
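One simple way to distinguish recent from old tags is to weight each tagging event by its age. The sketch below uses an exponential decay with a half-life, which is an assumption chosen for illustration and is not attributed to any of the cited approaches; the function name, the 30-day half-life and the event data are likewise hypothetical.

```python
def time_weighted_trust(tag_events, now, half_life_days=30.0):
    """Fraction of correct tags, with recent tags counting more than old ones.

    tag_events: list of (day, was_correct) pairs, where day is the time of
    the tagging event in days on the same clock as `now`.
    """
    weighted_correct = 0.0
    total_weight = 0.0
    for day, was_correct in tag_events:
        age = now - day
        weight = 0.5 ** (age / half_life_days)  # weight halves every half-life
        total_weight += weight
        if was_correct:
            weighted_correct += weight
    return weighted_correct / total_weight if total_weight else 0.0


# A user who tagged correctly a year ago but has recently started spamming.
turned_spammer = [(0, True), (10, True), (350, False), (360, False)]
# The opposite history: old mistakes, recent correct tags.
improved_user = [(0, False), (10, False), (350, True), (360, True)]

plain_ratio = sum(ok for _, ok in turned_spammer) / len(turned_spammer)
print(plain_ratio)                                   # ignores time
print(time_weighted_trust(turned_spammer, now=365))  # dominated by recent tags
print(time_weighted_trust(improved_user, now=365))
```

A plain ratio rates both users the same, whereas the time-weighted score follows their recent behavior. The half-life controls how quickly old behavior is forgotten.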
Open issues & challenges (cont.)
Multilingualism:
Most existing trust modeling approaches based on text information assume a monolingual environment.
Some text information may be regarded as wrong merely because of language differences: users come from various countries, so various languages appear simultaneously in tags and comments.
Open issues & challenges (cont.)
Interaction across social networks:
How can trust models be effectively connected and shared across domains?
E.g. users can use their Facebook accounts to log in to some other social network services.
Open issues & challenges (cont.)
Multimodal analysis:
Most current techniques for noise and spam reduction focus only on textual tag processing and user profile analysis, although the audio and visual features of multimedia content can also provide useful information about the relevance of the content and the content-tag relationship.
References
B. Sigurbjornsson and R. van Zwol, "Flickr tag recommendation based on collective knowledge," in Proc. WWW, Apr. 2008, pp. 327-336.
L. S. Kennedy, S.-F. Chang, and I. V. Kozintsev, "To search or to label?: Predicting the performance of search-based automatic image classifiers," in Proc. ACM MIR, Oct. 2006, pp. 249-258.
K. Liu, B. Fang, and Y. Zhang, "Detecting tag spam in social tagging systems with collaborative knowledge," in Proc. IEEE FSKD, Aug. 2009, pp. 427-431.
P. Heymann, G. Koutrika, and H. Garcia-Molina, "Fighting spam on social web sites: A survey of approaches and future challenges," IEEE Internet Comput., vol. 11, no. 6, pp. 36-45, Nov. 2007.
X. Li, C. Snoek, and M. Worring, "Learning tag relevance by neighbor voting for social image retrieval," in Proc. ACM MIR, Oct. 2008, pp. 180-187.
B. Markines, C. Cattuto, and F. Menczer, "Social spam detection," in Proc. ACM AIRWeb, Apr. 2009, pp. 41-48.
B. Krause, C. Schmitz, A. Hotho, and G. Stumme, "The anti-social tagger: Detecting spam in social bookmarking systems," in Proc. ACM AIRWeb, Apr. 2008, pp. 61-68.
G. Koutrika, F. A. Effendi, Z. Gyongyi, P. Heymann, and H. Garcia-Molina, "Combating spam in tagging systems: An evaluation," ACM TWEB, vol. 2, no. 4, pp. 22:1-22:34, Oct. 2008.
Z. Gyongyi, H. Garcia-Molina, and J. Pedersen, "Combating web spam with TrustRank," in Proc. VLDB, Aug. 2004, pp. 576-587.
C.-T. Wu, K.-T. Cheng, Q. Zhu, and Y.-L. Wu, "Using visual features for anti-spam filtering," in Proc. IEEE ICIP, Sep. 2005, vol. 3, pp. III-509-512.
References (cont.)
I. Ivanov, P. Vajda, J.-S. Lee, and T. Ebrahimi, "In tags we trust: Trust modeling in social tagging of multimedia content," IEEE SPM, vol. 29, no. 2, pp. 98-107, Mar. 2012.
I. Ivanov, P. Vajda, J.-S. Lee, L. Goldmann, and T. Ebrahimi, "Geotag propagation in social networks based on user trust model," MTAP, vol. 56, no. 1, pp. 155-177, Jan. 2012.
I. Ivanov, P. Vajda, L. Goldmann, J.-S. Lee, and T. Ebrahimi, "Object-based tag propagation for semi-automatic annotation of images," in Proc. ACM MIR, Mar. 2010, pp. 497-506.
T. Bogers and A. Van den Bosch, "Using language models for spam detection in social bookmarking," in Proc. ECML PKDD, Sep. 2008, pp. 1-12.
Z. Xu, Y. Fu, J. Mao, and D. Su, "Towards the semantic web: Collaborative tag suggestions," in Proc. ACM WWW, May 2006.
R. Krestel and L. Chen, "Using co-occurence of tags and resources to identify spammers," in Proc. ECML PKDD, Sep. 2008, pp. 38-46.
F. Benevenuto, T. Rodrigues, V. Almeida, J. Almeida, and M. Gonçalves, "Detecting spammers and content promoters in online video social networks," in Proc. ACM SIGIR, Jul. 2009, pp. 620-627.
K. Lee, J. Caverlee, and S. Webb, "Uncovering social spammers: social honeypots + machine learning," in Proc. ACM SIGIR, Jul. 2010, pp. 435-442.
M. G. Noll, C. A. Yeung, N. Gibbins, C. Meinel, and N. Shadbolt, "Telling experts from spammers: Expertise ranking in folksonomies," in Proc. ACM SIGIR, Jul. 2009, pp. 612-619.
J. Caverlee, L. Liu, and S. Webb, "SocialTrust: Tamper-resilient trust establishment in online communities," in Proc. ACM JCDL, Jun. 2008, pp. 104-114.
A. Hotho, D. Benz, R. Jäschke, and B. Krause, Eds., ECML PKDD Discovery Challenge, Sep. 2008. Available at: http://www.kde.cs.uni-kassel.de/ws/rsdc08.
Any questions?
Prof. Touradj Ebrahimi, Touradj.Ebrahimi@epfl.ch, MMSPG, EPFL