Big data, big research?
|
|
|
- Janis Bennett
- 10 years ago
- Views:
Transcription
1 Big data, big research? Opportunities and constraints for computer supported social science Jürgen Pfeffer Digital Methods Vienna, Austria, November 2013
2 Agenda Look and feel of big data research How is big data research different from traditional social science research? Methodological problems Big data Online social networks How big are big data? Technical/algorithmic problems 2
3 Goals Understanding big data research approach Seeing the current limitations Feeling the future potentials 3
4 Jürgen Pfeffer Assistant Research Professor School of Computer Science Carnegie Mellon University Vienna University of Technology: BA: Computer Science PhD: Business Informatics Corporate Consultant, Freelancer Research Studios Austria Trainer for Rhetoric and Personal Performance 4
5 Jürgen Pfeffer Research focus: Computational analysis of organizations and societies Special emphasis on large scale systems Methodological and algorithmic challenges Methods: Network analysis theories and methods Visual analytics, geographic information systems Agent based simulations, system dynamics Center for Computational Analysis of Social and Organizational Systems 5
6 Challenges for Analyzing Large Scale Systems Data Mining Text Mining Data to Network Model Algorithms Change Detection Visual Analytics Geo Analysis Modeling Simulation Mining of large amounts of diverse data Automated data to network processing Dynamic network analysis and change detection Visual analytics of network data Modeling and simulation of real world networks Toward a Real Time Analysis of Large Scale Dynamic Socio Cultural Systems 6
7 Toward a Real Time Analysis of Large Scale Dynamic Socio Cultural Systems 7
8 Motivation & Hope A field is emerging that leverages the capacity to collect and analyze data at a scale that may reveal patterns of individual and group behaviors. access to terabytes of data describing minute by minute interactions and locations of entire populations of individuals [will] offer qualitatively new perspectives on collective human behavior. Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A. L., Brewer, D., Christakis, N., Contractor, N., Fowler, J., Gutmann, M., Jebara, T., King, G., Macy, M., Roy, D., & Van Alstyne, M. (2009). Computational social science. Science, 323,
9 Motivation & Hope Social media offers us the opportunity for the first time to both observe human behavior and interaction in real time and on a global scale. Golder, S. A., & Macy, M. W. (2012, January). Social science with social media. ASA footnotes, 40(1), 7. 9
10 Example: Interplay Social Media/Traditional Media Offline and online media reinforce one another Social media are an important information source for traditional media (Diakopoulos et al., 2012). Twitter is used as radar Social media hooks are connected to the media story Significant amount of dynamics are external events and factors outside the network (Myers et al., 2012) Online firestorms: Social Media Traditional Media Cross media dynamics 10
11 Interplay Social Media/Traditional Media Traditional Social Science approaches: Survey Twitter/Facebook users Interview journalists Observe media web sites Content analysis Etc. 11
12 Interplay Social Media/Traditional Media Data driven approach: Contrast Arabic tweets with English news articles (2 weeks): 7,763 English news articles ( Syria ) ( سوريا ( Syria, 61,633 Arabic written tweets from 10,186 users Arabic written keywords related to humanitarian crisis, e.g. violence, death, food, shelter, etc. to reduce tweets Pfeffer, J., Carley, K. M. (2012). Social Networks, Social Media, Social Change. Proceedings of the 2nd International Conference on Cross Cultural Decision Making: Focus 2012, San Francisco, CA. 12
13 Interplay Social Media/Traditional Media Data mining approach: Carlos Castillo (Qatar Computing Research Institute, Doha, Qatar) Mohammed El Haddad (Al Jazeera, Doha, Qatar) Matt Stempeck (MIT Media Lab, Cambridge, USA) Jürgen Pfeffer (Carnegie Mellon University, Pittsburgh, USA) 13
14 Data Collection AlJazeera.com beacon embedded in all article pages events are processed using Apache S4 collect and aggregate the visits with a 1 minute granularity data is stored using a Cassandra NoSQL database Facebook.com collect messages from Facebook discussing the articles using the Facebook Query Language API Twitter.com collect messages from Twitter discussing the articles Using the Twitter Search API 14
15 Data Collection Case Study, 1 week of data: Number of articles 606 Visits after 7 days 3.6 M Facebook shares 155 K# Tweets 80 K Where do the article visits come from 15
16 Interplay Social Media/Traditional Media Castillo, Carlos & El-Haddad, Mohammed & Pfeffer, Jürgen & Stempeck, Mat (2014, forthcoming). Characterizing the Life Cycle of Online News Stories Using Social Media Reactions. 17th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW 2014), February 15-19, Baltimore, Maryland. 16
17 Interplay Traditional and Social Media Describing life cycle of online news stories Using early social media reactions 20 minutes of Social Media activities Can we estimate the 7 day visiting volume? Results: Social media reactions can contribute substantially to the understanding of visitation patterns in online news. After 20 Minutes In-depth News Facebook shares * * Twitter avg. followers *** - Volume of unique tweets - *** Twitter entropy *** *** 17 17
18 Al Jazeera Web Analytics Platform Al Jazeera launches predictive web analytics platform based on our research Media coverage: Qatar Tribune Doha News Gulf Times Fana News Albawaba Wan Ifra Rapid TV News Etc. 18
19 Big Data Principles: Collect All Data Collect all available data No sampling, N = all There are no unrelated data Messy data and bad data is good Thousands of ( independent ) variables We (the system) can decide later what is useful and what not 19
20 Data Driven Research Processes Social Science 1. Problem 2. Research Question/ Hypotheses 3. Theories 4. Methods 5. Data 6. Analysis 7. Result Presentation Typical Big Data Analysis 1. Methods 2. Data 3. Analysis 4. Result Presentation 5. Problem 20
21 Correlation not Cause: Babies and Storks Social Science Collect other (sociodemographic) variables Build hypotheses about underlying variables Figure out that education is a good predictor for babies and storks (non cities) Question: Why? Big Data Analysis Include ~1,200 variables in a regression like model. Number of storks and avg. car gas consumption are good enough predictors for number of babies Goodness of fit 21
22 Many Variables: Statistical Issues I 1 st example: 1 variable y, 100 elements, random variable x, 100 elements, random 0 1 Cor(x,y) = ~ nd example: 1 variable y, 100 elements, random variable x n, 100 elements, random 0 1 Cor(x n,y) =? Cor(x n,y) Something always correlates x n 22
23 Many Variables: Statistical Issues II 1 st example: 1 variable y, 100 elements, random variable x, 100 elements, random 0 1 r² lm(x,y) = ~.0 2 nd example: 1 variable y, 100 elements, random variable x n, 100 elements, random 0 1 r² lm(x 1 x n,y) =? r² If you use enough variables, your r² is always high Number of variables 23
24 N = All Is it all? All of what? Is it all of what we want? Is it all of what we think it is? 24
25 Multi Level Bias Problem 1. Do the people online represent society? 2. Do the people that are online behave like offline? 3. Do the created data represent human behavior? 4. Do the analyzed data represent the created data? C A B 25
26 Do Created Data Represent Human Behavior? Pfeffer, J. & Zorbach, T. & Carley, K.M. (2013). Understanding online firestorms: Negative word of mouth dynamics in social media networks. Journal of Marketing Communications 26
27 Empirical Observations/Factors Hundreds of friends create many information Offline: Hierarchical groups of alters (Zhou et al., 2005) Strength of ties amount of time, the emotional intensity, the intimacy, and the reciprocal service (Granovetter, 1973) In social media, every connection gets the same amount of attention Massive unrestrained information flow 27
28 Empirical Observations/Factors Amplified epidemic spreading, network clusters Average Facebook user Ann: 130 friends Ben posts a very interesting piece of information Ben s friends like what Ben says (Homophily) Ben s friends are also friends with Ann (Transitivity) Ann receive a large amount of posts to one topic Amplifying effects of opinion forming: echo chambers (Key, 1966) Network clusters & echo chambers 28
29 Empirical Observations/Factors Amplified epidemic spreading, network clusters Transitive link creations (Heider, 1946) Interpersonal communication networks have significant local clustering (e.g. Pfeffer and Carley, 2011) Local clusters are important for diffusion (e.g. Pfeffer and Carley, 2013) C A B Pfeffer, Jürgen & Carley, Kathleen M. (2013). The Importance of Local Clusters for the Diffusion of Opinions and Beliefs in Interpersonal Communication Networks. International Journal of Innovation and Technology Management 10 (5) Pfeffer, Jürgen & Carley, Kathleen M. (2011). Modeling and Calibrating Real World Interpersonal Networks. Proceedings of the IEEE NSW 2011, 1st International Workshop on Network Science,
30 Do Created Data Represent Human Behavior? Transitive link creations (Heider, 1946) Programmers of social media know this E.g. ~70% of new links on LinkedIn are triadic closure Groups to follow, etc. Social media systems intensify this effect with link suggestions Do we analyze society or a software implementation? C A B 30
31 Do the Analyzed Data Represent the System? Morstatter, F. & Pfeffer, J.& Liu, Huan & Carley, K.M. (2013). Is the Sample Good Enough? Comparing Data from Twitter's Streaming API with Twitter's Firehose. ICWSM, Boston, MA. 31
32 Sampling Twitter Data Firehose feed 100% costly. Streaming API feed 1% free. We don t know how Twitter samples data. Is the sampled data from the Streaming API representative of the true activity on Twitter s Firehose? 32
33 Sampling Twitter Data 33
34 Sampling Twitter Data 42% Overall Coverage Daily Coverage from 17% to 89%. Can we find the right key player? Measure k Average Agreement (min-max) All 28 Days In-Degree (0-9) 4 In-Degree (36-82) 73 Betweenness (41-81) 55 Potential Reach (32-83) 80 34
35 Multi Level Bias Problem? Do the people online represent society?? Do the people that are online behave like offline?? Do the created data represent human behavior?? Do the analyzed data represent the created data? C A B 35
36 How Big are Big Data? Facebook is collecting your data 500 terabytes a day 2.5 billion status updates, posts, photos, videos, comments per day 2.7 billion Likes per day 300 million photos uploaded per day $10M $20M/year is collecting your data 500 terabytes a day/ Total world Internet traffic in 2012: 1.1 Exabytes per day (1000petabytes = 1 million terabytes = 1 billion gigabytes) Store just metadata! 36
37 Big Networks: Memory Space 1 billion vertices, 100 billion edges 111 PB adjacency matrix 2.92 TB adjacency list 2.92 TB edge list Burkhardt & Waring, An NSA Big Graph experiment 37
38 Top Supercomputer Installations Titan Cray XK7 at ORNL #1 Top500 in million cores 710 TB memory 8.2 Megawatts 4300 sq.ft. Sequoia IBM Blue Gene/Q at LLNL #1 Graph500 in million cores 1 PB memory 7.9 Megawatts 3000 sq.ft. $7 million per year energy costs Source: Burkhardt & Waring, An NSA Big Graph experiment 38
39 Calculation Time State of the art algorithms for SNA metrics require Θ space and run in Θ time, some in Θ or Θ with n = number of nodes, m = number of edges. Example: 50k nodes, 193k edges Betweenness centrality (Freeman 1979) 1 processor, laptop: min 39
40 Calculation Time Betweenness Centrality with Facebook 264,399,256,813 min (500k years) With 1,000,000 cores: 0.5 years With 10x faster cores: 18.4 days Approximations and localized algorithms 40
41 Large Networks? 41
42 Large Networks? 42
43 Large Networks: New Algorithms? What are the interpretations of our traditional measures for large networks? E.g., does node nr. 1 sit in the center or rather on the fence? What does it mean to be the most central actor on Facebook? Approximation algorithms are new metrics! What are the research questions for large networked data at all? 43
44 Summary Big hopes and dreams related to big data especially to social media data Data driven research is different Combining this with traditional social science research is not trivial Multi level bias problem of social media data Sampling issues Representative Big data are really big But it is possible to handle big data/networks Many metrics have been developed for small groups, validity for big data is often not guaranteed 44
45 Other Issues Privacy Surveillance You are the product not the customer When correlations lead to predictions and interventions Predictive policing 45
46 What needs to be done? Don t be blinded by big data Ask questions: What do we learn from a study? Do the authors ask why? Good old research process is still important Don t be satisfied with one needle (especially, when you dream of the haystack) Let s utilize big data! But with care
47 Conclusions: Mixed Methods The digital records of online behavior and social interaction hold the promise of opening up a new era in the social and behavioral sciences, but when and whether this opportunity is realized may depend on the involvement and leadership of sociologists with the necessary technical and computational skills. Online data should therefore be viewed as a complement to, and not substitute for, data collected by traditional methods. Indeed, in many cases, the value of online data may depend on opportunities to integrate with data obtained from surveys. Golder, S. A., & Macy, M. W. (2012, January). Social science with social media. ASA footnotes, 40(1), 7. 47
48 Conclusions Initially, computational social science needs to be the work of teams of social and computer scientists. In the long run, the question will be whether academia should nurture computational social scientists, or teams of computationally literate social scientists and socially literate computer scientists. Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A. L., Brewer, D., Christakis, N., Contractor, N., Fowler, J., Gutmann, M., Jebara, T., King, G., Macy, M., Roy, D., & Van Alstyne, M. (2009). Computational social science. Science, 323, Ph.D. program in Computation, Organizations and Society (COS) Computing About and For Society Apply: 48
49 Our mission is to go forward, and it has only just begun. There's still much to do, still so much to learn. Engage! Jean-Luc Picard, TNG Season 1 Ep. 26 Jürgen Pfeffer [email protected]
The bad and the ugly: Online firestorms and hate propagation in social media networks
The bad and the ugly: Online firestorms and hate propagation in social media networks Jürgen Pfeffer Institute for Software Research School of Computer Science Carnegie Mellon University Center for Computational
The Big Data Paradigm Shift. Insight Through Automation
The Big Data Paradigm Shift Insight Through Automation Agenda The Problem Emcien s Solution: Algorithms solve data related business problems How Does the Technology Work? Case Studies 2013 Emcien, Inc.
Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank
Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»
IC05 Introduction on Networks &Visualization Nov. 2009. <[email protected]>
IC05 Introduction on Networks &Visualization Nov. 2009 Overview 1. Networks Introduction Networks across disciplines Properties Models 2. Visualization InfoVis Data exploration
Advanced Big Data Analytics with R and Hadoop
REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional
DIGITS CENTER FOR DIGITAL INNOVATION, TECHNOLOGY, AND STRATEGY THOUGHT LEADERSHIP FOR THE DIGITAL AGE
DIGITS CENTER FOR DIGITAL INNOVATION, TECHNOLOGY, AND STRATEGY THOUGHT LEADERSHIP FOR THE DIGITAL AGE INTRODUCTION RESEARCH IN PRACTICE PAPER SERIES, FALL 2011. BUSINESS INTELLIGENCE AND PREDICTIVE ANALYTICS
Outline. What is Big data and where they come from? How we deal with Big data?
What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,
Large-Scale Data Processing
Large-Scale Data Processing Eiko Yoneki [email protected] http://www.cl.cam.ac.uk/~ey204 Systems Research Group University of Cambridge Computer Laboratory 2010s: Big Data Why Big Data now? Increase
An NSA Big Graph experiment. Paul Burkhardt, Chris Waring. May 20, 2013
U.S. National Security Agency Research Directorate - R6 Technical Report NSA-RD-2013-056002v1 May 20, 2013 Graphs are everywhere! A graph is a collection of binary relationships, i.e. networks of pairwise
Exploring Big Data in Social Networks
Exploring Big Data in Social Networks [email protected] ([email protected]) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about
Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme
Big Data Analytics Prof. Dr. Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany 33. Sitzung des Arbeitskreises Informationstechnologie,
Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA
Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data
Collaborations between Official Statistics and Academia in the Era of Big Data
Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI [email protected] What
Can Flash help you ride the Big Data Wave? Steve Fingerhut Vice President, Marketing Enterprise Storage Solutions Corporation
Can Flash help you ride the Big Data Wave? Steve Fingerhut Vice President, Marketing Enterprise Storage Solutions Corporation Forward-Looking Statements During our meeting today we may make forward-looking
BIG DATA: BIG BOOST TO BIG TECH
BIG DATA: BIG BOOST TO BIG TECH Ms. Tosha Joshi Department of Computer Applications, Christ College, Rajkot, Gujarat (India) ABSTRACT Data formation is occurring at a record rate. A staggering 2.9 billion
Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料
Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料 美 國 13 歲 學 生 用 Big Data 找 出 霸 淩 熱 點 Puri 架 設 網 站 Bullyvention, 藉 由 分 析 Twitter 上 找 出 提 到 跟 霸 凌 相 關 的 詞, 搭 配 地 理 位 置
PROGRAM DIRECTOR: Arthur O Connor Email Contact: URL : THE PROGRAM Careers in Data Analytics Admissions Criteria CURRICULUM Program Requirements
Data Analytics (MS) PROGRAM DIRECTOR: Arthur O Connor CUNY School of Professional Studies 101 West 31 st Street, 7 th Floor New York, NY 10001 Email Contact: Arthur O Connor, [email protected] URL:
Big Graph Processing: Some Background
Big Graph Processing: Some Background Bo Wu Colorado School of Mines Part of slides from: Paul Burkhardt (National Security Agency) and Carlos Guestrin (Washington University) Mines CSCI-580, Bo Wu Graphs
AN INTRO TO DATA MANAGEMENT
AN INTRO TO DATA MANAGEMENT HOW TO USE DATA TO SUCCEED AT YOUR JOB by: TABLE OF CONTENTS Chapter 1: The Rise of Global Data Chapter 2: Taking Advantage of Big Data Chapter 3: Data Projects from Start to
The 4 Pillars of Technosoft s Big Data Practice
beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed
Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services
Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012 Viswa Sharma Solutions Architect Tata Consultancy Services 1 Agenda What is Hadoop Why Hadoop? The Net Generation is here Sizing the
What is Data Science? Girl Develop It! Meetup Renée M. P. Teate, March 2015
What is Data Science? { Girl Develop It! Meetup Renée M. P. Teate, March 2015 Let s start with: What is Data? http://upload.wikimedia.org/wikipedia/commons/f/f0/darpa _Big_Data.jpg https://encryptedtbn2.gstatic.com/images?q=tbn:and9gcs9dku3_tzi-swwyaqee5y0ehuvoiznsya_raknubbd0jyxpx7pw
ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012
Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering
International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop
ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: [email protected]
Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12
Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using
Information Flow and the Locus of Influence in Online User Networks: The Case of ios Jailbreak *
Information Flow and the Locus of Influence in Online User Networks: The Case of ios Jailbreak * Nitin Mayande & Charles Weber Department of Engineering and Technology Management Portland State University
Big Analytics: A Next Generation Roadmap
Big Analytics: A Next Generation Roadmap Cloud Developers Summit & Expo: October 1, 2014 Neil Fox, CTO: SoftServe, Inc. 2014 SoftServe, Inc. Remember Life Before The Web? 1994 Even Revolutions Take Time
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012
Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation
Big Data Analytics. Lucas Rego Drumond
Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 36 Outline
5 Point Social Media Action Plan.
5 Point Social Media Action Plan. Workshop delivered by Ian Gibbins, IG Media Marketing Ltd ([email protected], tel: 01733 241537) On behalf of the Chambers Communications Sector Introduction: There
Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014
Big Data Analytics An Introduction Oliver Fuchsberger University of Paderborn 2014 Table of Contents I. Introduction & Motivation What is Big Data Analytics? Why is it so important? II. Techniques & Solutions
Unlocking the Intelligence in. Big Data. Ron Kasabian General Manager Big Data Solutions Intel Corporation
Unlocking the Intelligence in Big Data Ron Kasabian General Manager Big Data Solutions Intel Corporation Volume & Type of Data What s Driving Big Data? 10X Data growth by 2016 90% unstructured 1 Lower
Extreme Computing. Big Data. Stratis Viglas. School of Informatics University of Edinburgh [email protected]. Stratis Viglas Extreme Computing 1
Extreme Computing Big Data Stratis Viglas School of Informatics University of Edinburgh [email protected] Stratis Viglas Extreme Computing 1 Petabyte Age Big Data Challenges Stratis Viglas Extreme Computing
Big Data: Study in Structured and Unstructured Data
Big Data: Study in Structured and Unstructured Data Motashim Rasool 1, Wasim Khan 2 [email protected], [email protected] Abstract With the overlay of digital world, Information is available
Finding a needle in Haystack: Facebook s photo storage IBM Haifa Research Storage Systems
Finding a needle in Haystack: Facebook s photo storage IBM Haifa Research Storage Systems 1 Some Numbers (2010) Over 260 Billion images (20 PB) 65 Billion X 4 different sizes for each image. 1 Billion
Temporal Visualization and Analysis of Social Networks
Temporal Visualization and Analysis of Social Networks Peter A. Gloor*, Rob Laubacher MIT {pgloor,rjl}@mit.edu Yan Zhao, Scott B.C. Dynes *Dartmouth {yan.zhao,sdynes}@dartmouth.edu Abstract This paper
Are You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
Big Systems, Big Data
Big Systems, Big Data When considering Big Distributed Systems, it can be noted that a major concern is dealing with data, and in particular, Big Data Have general data issues (such as latency, availability,
New Jersey Big Data Alliance
Rutgers Discovery Informatics Institute (RDI 2 ) New Jersey s Center for Advanced Computation New Jersey Big Data Alliance Manish Parashar Director, Rutgers Discovery Informatics Institute (RDI 2 ) Professor,
W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract
W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
ONLINE SOCIAL NETWORK ANALYTICS
ONLINE SOCIAL NETWORK ANALYTICS Course Syllabus ECTS: 10 Period: Summer 2013 (17 July - 14 Aug) Level: Master Language of teaching: English Course type: Summer University STADS UVA code: 460122U056 Teachers:
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON BIG DATA ISSUES AMRINDER KAUR Assistant Professor, Department of Computer
Big Data a threat or a chance?
Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but
SOCIAL MEDIA ADVERTISING STRATEGIES THAT WORK
SOCIAL MEDIA ADVERTISING STRATEGIES THAT WORK ABSTRACT» Social media advertising is a new and fast growing part of digital advertising. In this White Paper I'll present social media advertising trends,
ISSN: 2320-1363 CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS
CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS A.Divya *1, A.M.Saravanan *2, I. Anette Regina *3 MPhil, Research Scholar, Muthurangam Govt. Arts College, Vellore, Tamilnadu, India Assistant
DATA MINING TECHNIQUES FOR CRM
International Journal of Scientific & Engineering Research, Volume 4, Issue 4, April-2013 509 DATA MINING TECHNIQUES FOR CRM R.Senkamalavalli, Research Scholar, SCSVMV University, Enathur, Kanchipuram
Chapter 1. Contrasting traditional and visual analytics approaches
Chapter 1 Understanding Big Data Analytics In This Chapter Defining Big Data Understanding Big Data Analytics Contrasting traditional and visual analytics approaches The era of Big Data is upon us. The
Sources: Summary Data is exploding in volume, variety and velocity timely
1 Sources: The Guardian, May 2010 IDC Digital Universe, 2010 IBM Institute for Business Value, 2009 IBM CIO Study 2010 TDWI: Next Generation Data Warehouse Platforms Q4 2009 Summary Data is exploding
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
Cray: Enabling Real-Time Discovery in Big Data
Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects
Big Data Mining: Challenges and Opportunities to Forecast Future Scenario
Big Data Mining: Challenges and Opportunities to Forecast Future Scenario Poonam G. Sawant, Dr. B.L.Desai Assist. Professor, Dept. of MCA, SIMCA, Savitribai Phule Pune University, Pune, Maharashtra, India
MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS 2014. November 7, 2014. Machine Learning Group
Big Data and Its Implication to Research Methodologies and Funding Cornelia Caragea TARDIS 2014 November 7, 2014 UNT Computer Science and Engineering Data Everywhere Lots of data is being collected and
How To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 [email protected] www.scch.at Michael Zwick DI
The Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics
The Data Engineer Mike Tamir Chief Science Officer Galvanize Steven Miller Global Leader Academic Programs IBM Analytics Alessandro Gagliardi Lead Faculty Galvanize Businesses are quickly realizing that
BIG DATA CHALLENGES AND PERSPECTIVES
BIG DATA CHALLENGES AND PERSPECTIVES Meenakshi Sharma 1, Keshav Kishore 2 1 Student of Master of Technology, 2 Head of Department, Department of Computer Science and Engineering, A P Goyal Shimla University,
Data Centric Computing Revisited
Piyush Chaudhary Technical Computing Solutions Data Centric Computing Revisited SPXXL/SCICOMP Summer 2013 Bottom line: It is a time of Powerful Information Data volume is on the rise Dimensions of data
DEVELOPING A SOCIAL MEDIA STRATEGY
DEVELOPING A SOCIAL MEDIA STRATEGY Creating a social media strategy for your business 2 April 2012 Version 1.0 Contents Contents 2 Introduction 3 Skill Level 3 Video Tutorials 3 Getting Started with Social
What is Data Science? Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014
What is Data Science? { Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014 Let s start with: What is Data? http://upload.wikimedia.org/wikipedia/commons/f/f0/darpa
Application Development. A Paradigm Shift
Application Development for the Cloud: A Paradigm Shift Ramesh Rangachar Intelsat t 2012 by Intelsat. t Published by The Aerospace Corporation with permission. New 2007 Template - 1 Motivation for the
A Big Data Analytical Framework For Portfolio Optimization Abstract. Keywords. 1. Introduction
A Big Data Analytical Framework For Portfolio Optimization Dhanya Jothimani, Ravi Shankar and Surendra S. Yadav Department of Management Studies, Indian Institute of Technology Delhi {dhanya.jothimani,
**NEW CLIENTS MAY NEED AN INITIAL SET- UP and ANALYSIS
Pricing Structure Social Media Management Packages * Starter Package: Social Media for 2 Channels Starting at: $650 /mo (That s $650 dollars worth of Organic Advertising!) * Business Owner Package: Social
We are Big Data A Sonian Whitepaper
EXECUTIVE SUMMARY Big Data is not an uncommon term in the technology industry anymore. It s of big interest to many leading IT providers and archiving companies. But what is Big Data? While many have formed
Banking On A Customer-Centric Approach To Data
Banking On A Customer-Centric Approach To Data Putting Content into Context to Enhance Customer Lifetime Value No matter which company they interact with, consumers today have far greater expectations
How To Understand The Benefits Of Big Data
Findings from the research collaboration of IBM Institute for Business Value and Saïd Business School, University of Oxford Analytics: The real-world use of big data How innovative enterprises extract
NetApp Big Content Solutions: Agile Infrastructure for Big Data
White Paper NetApp Big Content Solutions: Agile Infrastructure for Big Data Ingo Fuchs, NetApp April 2012 WP-7161 Executive Summary Enterprises are entering a new era of scale, in which the amount of data
CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait
CSC590: Selected Topics BIG DATA & DATA MINING Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait Agenda Introduction What is Big Data Why Big Data? Characteristics of Big Data Applications of Big Data Problems
Big Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
Laurence Liew General Manager, APAC. Economics Is Driving Big Data Analytics to the Cloud
Laurence Liew General Manager, APAC Economics Is Driving Big Data Analytics to the Cloud Big Data 101 The Analytics Stack Economics of Big Data Convergence of the 3 forces Big Data Analytics in the Cloud
COMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
Hadoop for Enterprises:
Hadoop for Enterprises: Overcoming the Major Challenges Introduction to Big Data Big Data are information assets that are high volume, velocity, and variety. Big Data demands cost-effective, innovative
Putting Apache Kafka to Use!
Putting Apache Kafka to Use! Building a Real-time Data Platform for Event Streams! JAY KREPS, CONFLUENT! A Couple of Themes! Theme 1: Rise of Events! Theme 2: Immutability Everywhere! Level! Example! Immutable
Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
How To Use Hadoop For Gis
2013 Esri International User Conference July 8 12, 2013 San Diego, California Technical Workshop Big Data: Using ArcGIS with Apache Hadoop David Kaiser Erik Hoel Offering 1330 Esri UC2013. Technical Workshop.
Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist, Graph Computing. October 29th, 2015
E6893 Big Data Analytics Lecture 8: Spark Streams and Graph Computing (I) Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist, Graph Computing
Social-Sensed Multimedia Computing
Social-Sensed Multimedia Computing Wenwu Zhu Tsinghua University Multimedia Computing Search Recommend Multimedia Summarize Social Distribution... Sense from Social Preference Influence User behaviors
[Hadoop, Storm and Couchbase: Faster Big Data]
[Hadoop, Storm and Couchbase: Faster Big Data] With over 8,500 clients, LivePerson is the global leader in intelligent online customer engagement. With an increasing amount of agent/customer engagements,
Web Archiving and Scholarly Use of Web Archives
Web Archiving and Scholarly Use of Web Archives Helen Hockx-Yu Head of Web Archiving British Library 15 April 2013 Overview 1. Introduction 2. Access and usage: UK Web Archive 3. Scholarly feedback on
How Companies are! Using Spark
How Companies are! Using Spark And where the Edge in Big Data will be Matei Zaharia History Decreasing storage costs have led to an explosion of big data Commodity cluster software, like Hadoop, has made
Smart Policing Initiative Website and Social Media
Smart Policing Initiative Website and Social Media Vivian Chu, CNA Research Specialist Iris Gonzalez, CNA Project Manager February 8, 2012 This project was supported by Grant No. 2009-DG-BX-K021 awarded
http://cloud.dailymotion.com July 2014
July 2014 Dailymotion Cloud Positioning Two video platforms based on one infrastructure Dailymotion.com DELIVER, SHARE AND MONETIZE YOUR VIDEO CONTENT Online sharing videos platform Dailymotion Cloud CONCRETIZE
Big Data: Image & Video Analytics
Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)
Big Data Executive Survey
Big Data Executive Full Questionnaire Big Date Executive Full Questionnaire Appendix B Questionnaire Welcome The survey has been designed to provide a benchmark for enterprises seeking to understand the
The Emerging Ethics of Studying Social Media Use with a Heritage Twist
The Emerging Ethics of Studying Social Media Use with a Heritage Twist Sophia B. Liu Technology, Media and Society Program, ATLAS Institute University of Colorado, Boulder, CO 80302-0430, USA [email protected]
EVERYTHING THAT MATTERS IN ADVANCED ANALYTICS
EVERYTHING THAT MATTERS IN ADVANCED ANALYTICS Marcia Kaufman, Principal Analyst, Hurwitz & Associates Dan Kirsch, Senior Analyst, Hurwitz & Associates Steve Stover, Sr. Director, Product Management, Predixion
SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS
SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS Carlos Andre Reis Pinheiro 1 and Markus Helfert 2 1 School of Computing, Dublin City University, Dublin, Ireland
