How big is big data? An introduc+on for developmental professionals and researchers

Size: px
Start display at page:

Download "How big is big data? An introduc+on for developmental professionals and researchers"

Transcription

1 How big is big data? An introduc+on for developmental professionals and researchers Sriganesh Lokanathan, LIRNEasia Young Scholar Seminar CPRsouth 2014 Maropeng, South Africa 7 September 2014 This work was carried out with the aid of a grant from the InternaHonal Development Research Centre, Canada and the Department for InternaHonal Development UK.

2 Yup, big data is huge and it will grow exponenhally In 2013 humans created 4.4 zetabytes of digihzed data 1 zetabye = bytes To put that in perspechve: That is as much data as contained in 202 trillion print issues of the Economist magazine If laid out each sheet from all the magazines side by side, it would cover the Earth 4.4 Hmes x202 trillion If we stored that data on 2- TB external hard drives it would require 2.3 billion of them Enough to take up the space of 1071 Great Pyramids x1071 Sources: Calculations by Lokanathan based on IDC (2014), Letouze (2014) 2

3 But the big data phenomenon is not just about the size 3

4 Big Data The Vs VELOCITY Rate of change of data VOLUME Size of data VARIETY Different forms of data VERACITY Uncertainty of data VALUE PotenDal value of data 4

5 What has facilitated the rise of the big data phenomenon? Vast drops in the cost of storing and retrieving informahon ExponenHal growth in computer power and memory data can reside in persistent memory instead of disk and tape Major improvements in techniques for performing machine learning and reasoning 5

6 Sources of big data AdministraHve data E.g. digihzed medical records, insurance records, tax records, etc. Commercial transachons E.g. Bank transachons, credit card purchases, supermarket purchases, online purchases, etc. Sensors and tracking devices E.g. road and traffic sensors, climate sensors, equipment & infrastructure sensors, mobile phones, satellite/ GPS devices, etc. Online achvihes/ social media E.g. online search achvity, online page views, blogs/ FB/ twiaer posts, online audio/ video/ images, etc. 6

7 Big data isn t necessarily a new area: the census is the original big data First recorded census occurred in Egypt (circa 3340 BC) But the tools didn t exist to fully exploit them The first tools invented for the 1890 US census It took 7 years to process the 1880 census data Herman Hollerith invents the tabulahng machine to process the 1890 census data Source: Adam Schuster 7

8 Today s Big data is tomorrow s data At its core, the big data phenomenon is really about data analyhcs, albeit at a scale that was not imaginable before 8

9 If we want comprehensive coverage of the populahon, what are the sources of big data in developing economies? AdministraHve data? E.g. digihzed medical records, insurance records, tax records, etc. Commercial transachons? E.g. Bank transachons, credit card purchases, supermarket purchases, online purchases, etc. Sensors and tracking devices? E.g. road and traffic sensors, climate sensors, equipment & infrastructure sensors, mobile phones, satellite/ GPS devices, etc. Online achvihes/ social media? E.g. online search achvity, online page views, blogs/ FB/ twiaer posts, online audio/ video/ images, etc. 9

10 Currently only mobile network big data has the widest possible populahon coverage 10

11 LIRNEasia s Big Data for Development (BD4D) Research LIRNEasia has negohated access to historical and anonymized telecom network big data from mulhple operators in Sri Lanka In the current research cycle we are conduchng exploratory research on answering a few social science queshons related to mobility and connectedness developing a framework with privacy and self- regulatory guidelines for the collechon, use and sharing of mobile phone data. hap://lirneasia.net/projects/bd4d/ 11

12 The data sets MulHple mobile operators in Sri Lanka have provided LIRNEasia access to 4 different types of meta- data: Call Detail Records (CDRs) Records of calls, SMS- es, Internet access AirHme recharge records Data sets do not include any Personally IdenHfiable InformaHon (PII). All phone numbers are anonymized and LIRNEasia does not maintain any mappings of idenhfiers to original phone numbers 12

13 What are some types of big data captured by mobile network operators? Call Detail Record (CDR) Records of all calls made and received by a person created mainly for the purposes of billing Similar records exist for all SMS- es sent and received as well as for all Internet sessions The Cell ID in turn has a lat- long posihon associated with it. 13

14 What are some types of big data captured by mobile network operators (contd.)? AirDme reload records Records of all airhme reloads performed by prepaid SIMs Each row corresponds to a record of one person s achvity: 14

15 How can we use mobile network big data for development? 15

16 We can understand the mobility paaerns of the populahon Low High Volume of people 16

17 We can understand land use characterishcs People leave digital traces when they use communicahon devices. Diurnal paaerns of user achvity in different locahons can be used to classify land use characterishcs Euclidean distance between two Hme series is used to cluster base stahons in an unsupervised manner using k- means algorithm 17

18 With even a simplishc clustering into just two regions we can clearly see the commercial areas Cluster 1 corresponds to more commercial area Cluster 2 corresponds to more residenhal (and mixed- used regions) 18

19 We can understand the geospahal distribuhon of the social networks in the country Each link represents the normalized volume of calls between two DSDs Divisional Secretariat Division (DSD) is a third level administrahve division; 331 in total in LK Low No. of calls High 19

20 We can understand community structures The social network is segregated such that overlapping connechons between communihes are minimized using a modularity approach For Sri Lanka, the ophmal number of communihes discovered by the algorithm was 11 20

21 How much do these communihes mesh with exishng administrahve boundaries? Southern and Northern provinces have the highest similarity to their respechve provincial boundaries. 14

22 How much do these communihes mesh with exishng administrahve boundaries (contd.)? Colombo district is clustered as a single community and Gampaha is merged with North Western Province 15

23 We can leverage the differenhal paaerns of airhme recharge. Average recharge amount Higher household income Lower household income Recharge frequency 23

24 to create poverty maps An example from Cote d Ivoir Source: Thoralf GuHerrez, GauHer Krings, Vincent D. Blondel, (2013). EvaluaHng socio- economic state of a country analyzing airhme credit and mobile phone datasets. Available at hap://arxiv.org/abs/ Average airtime recharge 24

25 What s the use of such work? Can bring Hmely evidence into the policy making process in developing economies 25

26 An example of Hmely policy- relevant evidence The image on the leq depicts relahve density of people in Colombo city and the surrounding regions at 1300 compared to 0000 (midnight the previous day) on a normal weekday. The yellow to red colors depict areas whose density has increased relahve to midnight. The blue color depicts areas whose density has decreased relahve to midnight (the darker the blue, the greater the loss in density). The clear areas are those where the overall density has not changed. 26

27 Our findings closely match results from expensive & infrequent transportahon surveys 27

28 Taking a broader view: the overall process of leveraging mobile network big data for development Mobile network big data (CDRs, Internet access usage, airhme recharge records) Construct behavioral variables (i) Mobility variables (ii) Social variables (iii) ConsumpHon variables Other data sources (i) Census data (ii) HIES data (iii) Survey maps (iv) TransportaHon schedules (v) ++++ Analytics (i) (ii) Insights Urban planning Crisis management & DRR (iii) Health monitoring & planning (iv) Socio- economic monitoring & planning 28

29 What are other potenhal data sources for big data for development?: Some examples 29

30 UN Global Pulse Global Post Twiaer ConversaHon hap://post2015.unglobalpulse.net/ Searches approximately 500mn tweets daily across 16 key development topics (represented by about 25k keywords in mulhple languages) 30

31 Google Dengue Trends (hap:// Uses global search achvity Follow- up to the successful (or not) Google Flu Trends (hap:// ), but more on that later 31

32 That s all cool, but who can get such work done? Answer: collaboradve inter- disciplinary teams Data mining / computer science InformaHon science Econometrics/ stahshcs Social science (economics, polihcal science, sociology, anthropology, etc.) Domain experhse (e.g. urban & transport policy, etc.) 32

33 Spillovers from the big data phenomenon Greater emphases on data driven decision making Beaer tools for data extrachon, analysis and visualizahon even for small data PDF to excel convertors e.g. Tabula hap://tabula.nerdpower.org/ Google Fusion Tables haps://support.google.com/fusiontables/answer/ VisualizaHon: Data Driven Documents (D3) hap:// InfoVis hap://philogb.github.io/jit/ 33

34 AnalyHcal challenges of working with big data Data provenance & data cleaning RepresentaHveness Behavioral change Ground context CausaHon vs. correlahon The role of small data for verificahon as well as for bootstrapping Transparency & replicability 34

35 Data provenance Data provenance is the process of tracing the pathways taken by data from the originahng source through all the processes that may have mutated and/or replicated the data up unhl it is uhlized for the analyses in queshon How relevant is it for big data analyses? Quite relevant though it may not be feasible to establish provenance to the extent desired by the scienhfic community E.g. when analyzing CDRs, some operators may have mulhple records for forwarded calls. Not knowing this, can potenhally affect social network analysis 35

36 Data cleaning Same steps apply for big data as for small data : First, verify that quanhtahve and categorical variables are coded as it should be Second, remove outliers. 36

37 Is the data representahve? Large volume may make sampling rate irrelevant but doesn t necessarily make the data representahve QuesHon: can you think of some representahveness issues in reference to mobile network big data? 37

38 Changes in behavior Once you know the process people will try to beat it E.g. Google page rank and Search Engine OpHmizaHon (SEO) DigiHzed online behavior can be subject to self- censorship and the creahon of mulhple personas E.g. your Facebook and Twiaer posts can mirror how narcissishc you are hap:// /University- study- finds- Facebook- Twiaer- fuel- narcissism- different- ways.html TransacHonal data (by- product of doing something else e.g. making phone calls) is therefore the best form of behavioral data But even transachonal data can be subject to behavioral changes QuesHon: can you think of an example with respect to mobile network big data? 38

39 Ground context Knowing and understanding real world context is important, otherwise you might make false inferences Example from Sri Lanka: The majority of airhme reloads occur through scratch cards. Higher denominahon cards are not easily available QuesHon: Based on what you learnt earlier about how to differenhate between those form lower and higher income households, can you idenhfy a potenhal problem when trying to develop poverty maps from mobile network big data? 39

40 CausaHon versus CorrelaHon There are oqen more police in precincts with high crime, but that does not imply that increasing the number of police in a precinct would increase crime Source: Varian, H. R. (2013). Big Data: New Tricks for Econometrics. Available at hap://people.ischool.berkeley.edu/~hal/papers/2013/ml.pdf Big Data analyses can ONLY reveal correlahon NOT causahon You can get to causal effect in a Big Data world by running experiments, but you and I cannot do that easily 40

41 But in some instances correlahon can be enough to take achons E.g. Street Bump mapp in Boston hap:// Released by Boston City Hall for smartphones Once loaded on smartphones, uses the phone s accelerometer to detect potholes when users are driving, and then nohfies City Hall of locahon. Certain changes in accelerometer readings correlates with passing over a pothole, but no guarantee of causal effect. QuesHon: Why is correlahon enough to take achon in this case? 41

42 The role of tradihonal small data in verificahon and bootstrapping VerificaHon: E.g. we need transport survey data to verify if the paaerns of mobility discovered using mobile network big data meshes with real condihons Bootstrapping E.g. We can use mobile network big data from around the Hme of the census to build models to approximate socio- economic levels from just mobile network big data. This is then used to reverse engineer approximate census/ poverty maps in future when other data is not easily available. See Frias- MarHnez, V., & Virseda, J. (2012). On the relahonship between socio- economic factors and cell phone usage. In both cases, big data is complement and not a subshtute for small data, except in some cases. 42

43 Transparency & replicability A lot of intereshng big data sources are held by private enhhes The benefits of peer- review not yet easily available for much big data based research Less transparency and replicability means less chance of catching errors and improving the models 43

44 An illustrahon of the challenges using the example of Google Flu Trends (GFT) Since being launched successfully in 2008, subsequent research by others has found problems with how well GFT can predict incidence of Flu Problems: Original results based on bootstrapping with data from CDC, but there was no subsequent fine- tuning Over Hme more and more people started using Google to search for informahon on health issues, creahng rebound effects post introduchon of GFT Influenza- like illness rates did not necessarily correlate with actual influenza virus rates Original research arhcle did not reveal the specific 45 search terms that were used to build the model 44

45 AnalyHcal challenges of working with big data Data provenance & data cleaning RepresentaHveness Behavioral change Ground context CausaHon vs. correlahon The role of small data for verificahon as well as for bootstrapping Transparency & replicability For more discussion on the above see my guest blog post at h6p:// challenges- big- data 45

46 Other challenges of trying to leverage big data for development Gezng access to data Privacy 46

47 Gezng access Some, such as Twiaer data are accessible to all (with some volume limits) But for most private data sources, inihally it will be through ad hoc means, though that is changing E.g. Orange Data for Development compehhon & Telecom Italia Big Data challenge See hap:// & hap:// LIRNEasia and others working towards opening up the playing field for others 47

48 Privacy There are personal and collechve privacy implicahons The difficulty is that: Informed consent is meaningless in a big data world People actually don t really know what their generalizable privacy needs are or how their preferences might evolve Ways to deal with the privacy issue: AnonymizaHon (with caveats) Different levels of disaggregahon of shared data EvoluHon from a rights based approach to a harms based approach Also see: hap://lirneasia.net/wp- content/uploads/2014/08/draq- guidelines- 2.2.pdf 48

49 What you should have hopefully gained from today s lecture Some understanding of what big data means An appreciahon for some of the developmental opportunihes from leveraging big data using different sources A brief overview of the spillover effects of the big data phenomenon An appreciahon of some of the analyhcal and other challenges concerning the use of big data for development 49

50 Thank you. 50

Big data for beginners An introduction for developmental professionals and researchers

Big data for beginners An introduction for developmental professionals and researchers Big data for beginners An introduction for developmental professionals and researchers Sriganesh Lokanathan, LIRNEasia Research relevant to broadband policy and regulatory processes New Delhi, India 21

More information

How To Understand The Political And Social Situation In Sri Lanka

How To Understand The Political And Social Situation In Sri Lanka Mobile network big data for urban and transporta4on planning in Colombo, Sri Lanka Rohan Samarajiva & Sriganesh Lokanathan Data for Policy conference, 17 June 2015 This work was carried out with the aid

More information

11 th World Telecommunication/ICT Indicators Symposium (WTIS-13)

11 th World Telecommunication/ICT Indicators Symposium (WTIS-13) 11 th World Telecommunication/ICT Indicators Symposium (WTIS-13) Mexico City, México, 4-6 December 2013 Contribution to WTIS-13 Document C/21-E 6 December 2013 English SOURCE: TITLE: LIRNEasia Leveraging

More information

Big Data for Development: New opportunities for emerging markets

Big Data for Development: New opportunities for emerging markets Big Data for Development: New opportunities for emerging markets International JRC-IPTS Workshop Big Data for the Analysis of Digital Economy and Society Sevilla, 22 September 2015 This work was carried

More information

De la Business Intelligence aux Big Data. Marie- Aude AUFAURE Head of the Business Intelligence team Ecole Centrale Paris. 22/01/14 Séminaire Big Data

De la Business Intelligence aux Big Data. Marie- Aude AUFAURE Head of the Business Intelligence team Ecole Centrale Paris. 22/01/14 Séminaire Big Data De la Business Intelligence aux Big Data Marie- Aude AUFAURE Head of the Business Intelligence team Ecole Centrale Paris 22/01/14 Séminaire Big Data 1 Agenda EvoluHon of Business Intelligence SemanHc Technologies

More information

BIG DATA FUNDAMENTALS

BIG DATA FUNDAMENTALS BIG DATA FUNDAMENTALS Timeframe Minimum of 30 hours Use the concepts of volume, velocity, variety, veracity and value to define big data Learning outcomes Critically evaluate the need for big data management

More information

Statistical Challenges with Big Data in Management Science

Statistical Challenges with Big Data in Management Science Statistical Challenges with Big Data in Management Science Arnab Kumar Laha Indian Institute of Management Ahmedabad Analytics vs Reporting Competitive Advantage Reporting Prescriptive Analytics (Decision

More information

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»

More information

Big Data, Official Statistics and Social Science Research: Emerging Data Challenges

Big Data, Official Statistics and Social Science Research: Emerging Data Challenges Big Data, Official Statistics and Social Science Research: Emerging Data Challenges Professor Paul Cheung Director, United Nations Statistics Division Building the Global Information System Elements of

More information

REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION

REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION Pilar Rey del Castillo May 2013 Introduction The exploitation of the vast amount of data originated from ICT tools and referring to a big variety

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme Big Data Analytics Prof. Dr. Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany 33. Sitzung des Arbeitskreises Informationstechnologie,

More information

Collaborations between Official Statistics and Academia in the Era of Big Data

Collaborations between Official Statistics and Academia in the Era of Big Data Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI vnn@umich.edu What

More information

Now, Next and the Future: IT, Big Data and other Implications for RIM. Presented by Michael S. Smith / http://about.me/mikessmith

Now, Next and the Future: IT, Big Data and other Implications for RIM. Presented by Michael S. Smith / http://about.me/mikessmith Now, Next and the Future: IT, Big Data and other Implications for RIM Agenda for This Afternoon Now: What trends are creating implications within the profession? Next: Why is IT now concerned about RIM?

More information

SEO 2.0 ADVANCED SEO TIPS & TECHNIQUES ABSTRACT»

SEO 2.0 ADVANCED SEO TIPS & TECHNIQUES ABSTRACT» 2.0 ADVANCED TIPS & TECHNIQUES ABSTRACT» Savvy online marketers know that their website is a great tool for branding, content promotion and demand generation. And they realize that search engine optimization

More information

Big Data. What is Big Data? Over the past years. Big Data. Big Data: Introduction and Applications

Big Data. What is Big Data? Over the past years. Big Data. Big Data: Introduction and Applications Big Data Big Data: Introduction and Applications August 20, 2015 HKU-HKJC ExCEL3 Seminar Michael Chau, Associate Professor School of Business, The University of Hong Kong Ample opportunities for business

More information

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Software Engineering for Big Data CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Big Data Big data technologies describe a new generation of technologies that aim

More information

BIG DATA, SMALL DATA, OPEN DATA

BIG DATA, SMALL DATA, OPEN DATA BIG DATA, SMALL DATA, OPEN DATA opportunities and challenges for governance BETHSIMONENOVECK Belk School of Business, Charlotte, North Carolina MAY132014 http://online.northcarolina.edu/unconline/courses.php

More information

How is the World Bank harnessing Big Data for development? Isabelle Huynh Sr Operations Officer World Bank

How is the World Bank harnessing Big Data for development? Isabelle Huynh Sr Operations Officer World Bank How is the World Bank harnessing Big Data for development? Isabelle Huynh Sr Operations Officer World Bank How is the World Bank harnessing Big Data for development? Isabelle Huynh Sr Operations Officer

More information

Crowdsourcing mobile networks from experiment

Crowdsourcing mobile networks from experiment Crowdsourcing mobile networks from the experiment Katia Jaffrès-Runser University of Toulouse, INPT-ENSEEIHT, IRIT lab, IRT Team Ecole des sciences avancées de Luchon Networks and Data Mining, Session

More information

Big Data-Challenges and Opportunities

Big Data-Challenges and Opportunities Big Data-Challenges and Opportunities White paper - August 2014 User Acceptance Tests Test Case Execution Quality Definition Test Design Test Plan Test Case Development Table of Contents Introduction 1

More information

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms

More information

Trusted Personal Data Management A User-Centric Approach

Trusted Personal Data Management A User-Centric Approach GRUPPO TELECOM ITALIA Future Cloud Seminar Oulu, August 13th 2014 A User-Centric Approach SKIL Lab, Trento - Italy Why are we talking about #privacy and #personaldata today? 3 Our data footprint Every

More information

Big Data & Analytics: Your concise guide (note the irony) Wednesday 27th November 2013

Big Data & Analytics: Your concise guide (note the irony) Wednesday 27th November 2013 Big Data & Analytics: Your concise guide (note the irony) Wednesday 27th November 2013 Housekeeping 1. Any questions coming out of today s presentation can be discussed in the bar this evening 2. OCF is

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

Data Storage - Should We Use It?

Data Storage - Should We Use It? BIG DATA MASTERCLASS DUBAI 25 FEBRUARY 2013 DR BRENDAN O DONOVAN HEAD OF STRATEGY (EMEA) BIG DATA SEEMS TO BE EVERYWHERE HOW TO TELL HYPE FROM REALITY? UN GLOBAL PULSE BIG DATA FOR DEVELOPMENT CANCER RESEARCH

More information

Big Data Analytics: 14 November 2013

Big Data Analytics: 14 November 2013 www.pwc.com CSM-ACE 2013 Big Data Analytics: Take it to the next level in building innovation, differentiation and growth 14 About me Data analytics in the UK Forensic technology and data analytics in

More information

ICT Perspectives on Big Data: Well Sorted Materials

ICT Perspectives on Big Data: Well Sorted Materials ICT Perspectives on Big Data: Well Sorted Materials 3 March 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations in

More information

BIG DATA: IT MAY BE BIG BUT IS IT SMART?

BIG DATA: IT MAY BE BIG BUT IS IT SMART? BIG DATA: IT MAY BE BIG BUT IS IT SMART? Turning Big Data into winning strategies A GfK Point-of-view 1 Big Data is complex Typical Big Data characteristics?#! %& Variety (data in many forms) Data in different

More information

!!! The Fallacy of Big Data! Brian Fine and Con Menictas!

!!! The Fallacy of Big Data! Brian Fine and Con Menictas! !!! The Fallacy of Big Data! Brian Fine and Con Menictas! 1! What is Big Data?! Big data is a vague term for a massive phenomenon that has rapidly become an obsession with entrepreneurs, scientists, governments

More information

Exploring New Methods of Data Gathering in Long-Distance Passenger Travel Data

Exploring New Methods of Data Gathering in Long-Distance Passenger Travel Data Exploring New Methods of Data Gathering in Long-Distance Passenger Travel Data Ben Pierce, Battelle June 8, 2011 1 Why Are New Methods Needed? Declining Response Rates Survey saturation in US Growing cell-only

More information

Understanding communities using mobile network big data

Understanding communities using mobile network big data Understanding communities using mobile network big data CPRsouth 2015 Kaushalya Madhawa, Sriganesh Lokanathan, Rohan Samarajiva, Danaja Maldeniya July 2015 LIRNEasia info@lirneasia.net 1 LIRNEasia is a

More information

CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait

CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait CSC590: Selected Topics BIG DATA & DATA MINING Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait Agenda Introduction What is Big Data Why Big Data? Characteristics of Big Data Applications of Big Data Problems

More information

BIG DATA : Big Opportunity or Big Threat for Official Statistics?* Jose Ramon G. Albert, Ph.D. Secretary General, NSCB Email: jrg.albert@nscb.gov.

BIG DATA : Big Opportunity or Big Threat for Official Statistics?* Jose Ramon G. Albert, Ph.D. Secretary General, NSCB Email: jrg.albert@nscb.gov. BIG DATA : Big Opportunity or Big Threat for Official Statistics?* Jose Ramon G. Albert, Ph.D. Secretary General, NSCB Email: jrg.albert@nscb.gov.ph 1 *Views expressed do not reflect those at NSCB Outline

More information

Small Business Webinar SEOTrainingSW.com 7 Secrets to Online Marketing

Small Business Webinar SEOTrainingSW.com 7 Secrets to Online Marketing Introduction: Jon Rognerud, the best selling author of The Ultimate Guide to Search Engine Optimization and one of the top SEO s in our industry was our guest during this 90- minute webinar focusing on

More information

Beyond Watson: The Business Implications of Big Data

Beyond Watson: The Business Implications of Big Data Beyond Watson: The Business Implications of Big Data Shankar Venkataraman IBM Program Director, STSM, Big Data August 10, 2011 The World is Changing and Becoming More INSTRUMENTED INTERCONNECTED INTELLIGENT

More information

Big Data for Development

Big Data for Development Big Data for Development Malarvizhi Veerappan Senior Data Scientist @worldbankdata @malarv Why is Big Data relevant for Development? In developing countries there are large data gaps, both in quantity

More information

WordPress SEO 101 http://philacsinclair.com

WordPress SEO 101 http://philacsinclair.com WordPress SEO 101 http://philacsinclair.com Copyright All rights reserved worldwide. YOUR RIGHTS: This book is restricted to your personal use only. It does not come with any other rights. LEGAL DISCLAIMER:

More information

USING SPATIAL DATA MINING TO DISCOVER THE HIDDEN RULES IN THE CRIME DATA

USING SPATIAL DATA MINING TO DISCOVER THE HIDDEN RULES IN THE CRIME DATA USING SPATIAL DATA MINING TO DISCOVER THE HIDDEN RULES IN THE CRIME DATA Karel, JANEČKA 1, Hana, HŮLOVÁ 1 1 Department of Mathematics, Faculty of Applied Sciences, University of West Bohemia Abstract Univerzitni

More information

EXECUTIVE REPORT. Big Data and the 3 V s: Volume, Variety and Velocity

EXECUTIVE REPORT. Big Data and the 3 V s: Volume, Variety and Velocity EXECUTIVE REPORT Big Data and the 3 V s: Volume, Variety and Velocity The three V s are the defining properties of big data. It is critical to understand what these elements mean. The main point of the

More information

Tuesday, August 13, 13

Tuesday, August 13, 13 Tuesday, August 13, 13 Search Marketing Service Search Engine Optimization Social Media Optimization Search Engine Marketing Newsletter and Blog Management Web Design /Development / Webmaster Tuesday,

More information

Matt Erickson Economist American Farm Bureau Federation March 5, 2014

Matt Erickson Economist American Farm Bureau Federation March 5, 2014 Matt Erickson Economist American Farm Bureau Federation March 5, 2014 What is Big Data? Discuss the intricacies of Big Data that exist today and what to expect ahead. Discuss the strengths, challenges

More information

Big Data Systems CS 5965/6965 FALL 2014

Big Data Systems CS 5965/6965 FALL 2014 Big Data Systems CS 5965/6965 FALL 2014 Today General course overview Q&A Introduction to Big Data Data Collection Assignment #1 General Course Information Course Web Page http://www.cs.utah.edu/~hari/teaching/fall2014.html

More information

BIG DATA: BIG BOOST TO BIG TECH

BIG DATA: BIG BOOST TO BIG TECH BIG DATA: BIG BOOST TO BIG TECH Ms. Tosha Joshi Department of Computer Applications, Christ College, Rajkot, Gujarat (India) ABSTRACT Data formation is occurring at a record rate. A staggering 2.9 billion

More information

(Big) Data Analytics: From Word Counts to Population Opinions

(Big) Data Analytics: From Word Counts to Population Opinions (Big) Data Analytics: From Word Counts to Population Opinions Mark Keane Insight@University College Dublin October 2014 ~ RSS ~ Edinburgh September 2014/EPIC 2 September 2014/EPIC 3 September 2014/EPIC

More information

What happens when Big Data and Master Data come together?

What happens when Big Data and Master Data come together? What happens when Big Data and Master Data come together? Jeremy Pritchard Master Data Management fgdd 1 What is Master Data? Master data is data that is shared by multiple computer systems. The Information

More information

Harnessing the Potential of Data Scientists and Big Data for Scientific Discovery

Harnessing the Potential of Data Scientists and Big Data for Scientific Discovery Harnessing the Potential of Data Scientists and Big Data for Scientific Discovery Ed Lazowska, University of Washington Saul Perlmu=er, UC Berkeley Yann LeCun, New York University Josh Greenberg, Alfred

More information

Data Centric Computing Revisited

Data Centric Computing Revisited Piyush Chaudhary Technical Computing Solutions Data Centric Computing Revisited SPXXL/SCICOMP Summer 2013 Bottom line: It is a time of Powerful Information Data volume is on the rise Dimensions of data

More information

Mining Big Data. Pang-Ning Tan. Associate Professor Dept of Computer Science & Engineering Michigan State University

Mining Big Data. Pang-Ning Tan. Associate Professor Dept of Computer Science & Engineering Michigan State University Mining Big Data Pang-Ning Tan Associate Professor Dept of Computer Science & Engineering Michigan State University Website: http://www.cse.msu.edu/~ptan Google Trends Big Data Smart Cities Big Data and

More information

Big Data for Social Good. Nuria Oliver, PhD Scientific Director User, Data and Media Intelligence Telefonica Research

Big Data for Social Good. Nuria Oliver, PhD Scientific Director User, Data and Media Intelligence Telefonica Research Big Data for Social Good Nuria Oliver, PhD Scientific Director User, Data and Media Intelligence Telefonica Research 6.8 billion subscriptions 96% of world s population (ITU) Mobile penetration of 120%

More information

SEO Workshop Today s Coach Lynn Stevenson. SEO Analyst

SEO Workshop Today s Coach Lynn Stevenson. SEO Analyst SEO Workshop Today s Coach Lynn Stevenson SEO Analyst Overview Introduction to SEO Importance of Content SEO Content Best Practices Keyword Research Optimizing Content Common Pitfalls Social Media and

More information

Crime Hotspots Analysis in South Korea: A User-Oriented Approach

Crime Hotspots Analysis in South Korea: A User-Oriented Approach , pp.81-85 http://dx.doi.org/10.14257/astl.2014.52.14 Crime Hotspots Analysis in South Korea: A User-Oriented Approach Aziz Nasridinov 1 and Young-Ho Park 2 * 1 School of Computer Engineering, Dongguk

More information

SEO Consulting Services By Cromosys. [Strategy & Plan]

SEO Consulting Services By Cromosys. [Strategy & Plan] SEO Consulting Services By Cromosys [Strategy & Plan] CROMOSYS SEO SERVICES WITH DETAILED STRATEGY & PLAN Cromosys offers SEO Services in terms of expertise and unique services. Cromosys will help you

More information

Understanding & Realizing Big Data Potential

Understanding & Realizing Big Data Potential Understanding & Realizing Big Data Potential 2014 Latin America Treasury & Finance Conference A Blueprint for a Digitally Connected Treasury Driss R. Temsamani Analytics & Innovation Head driss.r.temsamani@citi.com

More information

URBAN MOBILITY IN CLEAN, GREEN CITIES

URBAN MOBILITY IN CLEAN, GREEN CITIES URBAN MOBILITY IN CLEAN, GREEN CITIES C. G. Cassandras Division of Systems Engineering and Dept. of Electrical and Computer Engineering and Center for Information and Systems Engineering Boston University

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Cloud Computing and Big Data What s the Big Deal

Cloud Computing and Big Data What s the Big Deal Cloud Computing and Big Data What s the Big Deal Arlene Minkiewicz, Chief Scientist PRICE Systems, LLC arlene.minkiewicz@pricesystems.com Optimize tomorrow today. 1 Agenda Introduction Cloud Computing

More information

Introduction 3. Step One: Create a Keyword Strategy 4. Step Two: Optimize Your Website 7. Step Three: Create Blog and Other Content 14

Introduction 3. Step One: Create a Keyword Strategy 4. Step Two: Optimize Your Website 7. Step Three: Create Blog and Other Content 14 Table of Contents Introduction 3 Step One: Create a Keyword Strategy 4 Step Two: Optimize Your Website 7 Step Three: Create Blog and Other Content 14 Step Four: Promote Content & Participate in Social

More information

Impact of Social Media Marketing on SME Business Author:

Impact of Social Media Marketing on SME Business Author: Impact of Social Media Marketing on SME Business Author: Rahul Jain, Director Digital Marketing & Chief Strategist, InnoServ Digital, InnoServ Solutions Pvt. Ltd. ABSTRACT With the growth of Internet and

More information

Improving video search engine rankings

Improving video search engine rankings 5 THINGS YOU NEED TO KNOW ABOUT VIDEO SEO Improving video search engine rankings PlanetStream Ltd Churchfield House 36 Vicar Street Dudley West Midlands DY2 8RG United Kingdom helpdesk@planetstream.net

More information

The Social Impact of Open Data

The Social Impact of Open Data United States of America Federal Trade Commission The Social Impact of Open Data Remarks of Maureen K. Ohlhausen 1 Commissioner, Federal Trade Commission Center for Data Innovation The Social Impact of

More information

How To Improve Data Quality

How To Improve Data Quality Big Data Promises and Pitfalls David J. Hand Imperial College, London and Winton Capital Management July 2015 Policy making in the Big Data Era 1 The world of data is changing Not something which happens

More information

Overview. Bigness, in more formal terms. Big Data? ITNPBD4: Lecture 2. d i =<d i,j >, j =1:M. d i

Overview. Bigness, in more formal terms. Big Data? ITNPBD4: Lecture 2. d i =<d i,j >, j =1:M. d i ITNPBD4: Lecture 2 Leslie Smith Overview What makes big data big and why isn t this simply pure Mathematics? Cleanness and completeness of data Why all the fuss now? What s new technologically What do

More information

Copyright 2013 Splunk, Inc. Splunk 6 Overview. Presenter Name, Presenter Title

Copyright 2013 Splunk, Inc. Splunk 6 Overview. Presenter Name, Presenter Title Copyright 2013 Splunk, Inc. Splunk 6 Overview Presenter Name, Presenter Title Safe Harbor Statement During the course of this presentahon, we may make forward looking statements regarding future events

More information

monitoring and for development

monitoring and for development Measuring the Information Society Report 2014 Chapter 5. The role of big data for ICT monitoring and for development 5.1 Introduction One of the key challenges in measuring the information society has

More information

Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

More information

Research Note What is Big Data?

Research Note What is Big Data? Research Note What is Big Data? By: Devin Luco Copyright 2012, ASA Institute for Risk & Innovation Keywords: Big Data, Database Management, Data Variety, Data Velocity, Data Volume, Structured Data, Unstructured

More information

Smart Cities. Opportunities for Service Providers

Smart Cities. Opportunities for Service Providers Smart Cities Opportunities for Service Providers By Zach Cohen Smart cities will use technology to transform urban environments. Cities are leveraging internet pervasiveness, data analytics, and networked

More information

1. Understanding Big Data

1. Understanding Big Data Big Data and its Real Impact on Your Security & Privacy Framework: A Pragmatic Overview Erik Luysterborg Partner, Deloitte EMEA Data Protection & Privacy leader Prague, SCCE, March 22 nd 2016 1. 2016 Deloitte

More information

What is Big Data? The three(or four) Vs in Big Data In 2013 the total amount of stored information is estimated to be Volume.

What is Big Data? The three(or four) Vs in Big Data In 2013 the total amount of stored information is estimated to be Volume. 8/26/2014 CS581 Big Data - Fall 2014 1 8/26/2014 CS581 Big Data - Fall 2014 2 CS535/CS581A BIG DATA What is Big Data? PART 0. INTRODUCTION 1. INTRODUCTION TO BIG DATA 2. COURSE INTRODUCTION PART 0. INTRODUCTION

More information

Big Data and utility function in bank services. Nikolay K. Vitanov 1

Big Data and utility function in bank services. Nikolay K. Vitanov 1 Big Data and utility function in bank services Selected aspects Nikolay K. Vitanov 1 1 Institute of Mechanics, Bulgarian Academy of Sciences Sofia, 16. 06. 2015 Vitanov (BAS) Big Data and utility function

More information

20 A Visualization Framework For Discovering Prepaid Mobile Subscriber Usage Patterns

20 A Visualization Framework For Discovering Prepaid Mobile Subscriber Usage Patterns 20 A Visualization Framework For Discovering Prepaid Mobile Subscriber Usage Patterns John Aogon and Patrick J. Ogao Telecommunications operators in developing countries are faced with a problem of knowing

More information

Analyzing Huge Data Sets in Forensic Investigations

Analyzing Huge Data Sets in Forensic Investigations Analyzing Huge Data Sets in Forensic Investigations Kasun De Zoysa Yasantha Hettiarachi Department of Communication and Media Technologies University of Colombo School of Computing Colombo, Sri Lanka Centre

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

The Impact of Big Data on Social Research David Rhind Sharon Witherspoon

The Impact of Big Data on Social Research David Rhind Sharon Witherspoon The Impact of Big Data on Social Research David Rhind Sharon Witherspoon 1 www.nuffieldfoundation.org The landscape to be covered What is Big Data? Just consultants hype? Key questions for SRA Technology

More information

Leveraging. Digital Marketing. And Connecting With Healthcare Consumers. Presented by: Brandi Unger and Kirstie Hamel

Leveraging. Digital Marketing. And Connecting With Healthcare Consumers. Presented by: Brandi Unger and Kirstie Hamel Leveraging Digital Marketing And Connecting With Healthcare Consumers Presented by: Brandi Unger and Kirstie Hamel Who Amplified Digital is a digital marketing agency that helps local businesses connect

More information

THANK YOU. Mia Araminta,

THANK YOU. Mia Araminta, THANK YOU Mia Araminta, Thank you for reading this proposal. I have reviewed the goals of this project and would like to submit this proposal for you to review. Please let me know if you have any questions

More information

Web Archiving and Scholarly Use of Web Archives

Web Archiving and Scholarly Use of Web Archives Web Archiving and Scholarly Use of Web Archives Helen Hockx-Yu Head of Web Archiving British Library 15 April 2013 Overview 1. Introduction 2. Access and usage: UK Web Archive 3. Scholarly feedback on

More information

Recent developments in EU Transport Policy

Recent developments in EU Transport Policy Recent developments in EU Policy Paolo Bolsi DG MOVE - Unit A3 Economic Analysis and Impact Assessment 66 th Session - Working Party on Statistics United Nations Economic Committee for Europe 17-19 June

More information

Search Engine Optimization A Beginner s Guide to Climbing Search Engine s Rankings

Search Engine Optimization A Beginner s Guide to Climbing Search Engine s Rankings THE ESSENTIAL MANUAL TO Search Engine Optimization A Beginner s Guide to Climbing Search Engine s Rankings A publication of fpg www.fame-production.com Table of Contents About the Author & Introduction

More information

CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science

CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science Dr. Daisy Zhe Wang CISE Department University of Florida August 25th 2014 20 Review Overview of Data Science Why Data

More information

An analysis of Big Data ecosystem from an HCI perspective.

An analysis of Big Data ecosystem from an HCI perspective. An analysis of Big Data ecosystem from an HCI perspective. Jay Sanghvi Rensselaer Polytechnic Institute For: Theory and Research in Technical Communication and HCI Rensselaer Polytechnic Institute Wednesday,

More information

Sources: Summary Data is exploding in volume, variety and velocity timely

Sources: Summary Data is exploding in volume, variety and velocity timely 1 Sources: The Guardian, May 2010 IDC Digital Universe, 2010 IBM Institute for Business Value, 2009 IBM CIO Study 2010 TDWI: Next Generation Data Warehouse Platforms Q4 2009 Summary Data is exploding

More information

One View Of Customer Data & Marketing Data

One View Of Customer Data & Marketing Data One View Of Customer Data & Marketing Data Ian Kenealy, Head of Customer Data & Analytics, RSA spoke to the CX Network and shared his thoughts on all things customer, data and analytics! Can you briefly

More information

46 th Session of the UNSC, Side Event organized by UNSD 4 March 2015, New York

46 th Session of the UNSC, Side Event organized by UNSD 4 March 2015, New York Big Data: How do we meet the expectations? 46 th Session of the UNSC, Side Event organized by UNSD 4 March 2015, New York The telecommunication/ict sector as a major source of big data Susan Teltscher

More information

Search Engine Optimisation Managed Service

Search Engine Optimisation Managed Service SEO Managed Service Search Engine Optimisation Managed Service Every day over 350 million searches are performed across the internet so it s imperative that your website stands out from the competition

More information

BUILD LINKS POST- PENGUIN

BUILD LINKS POST- PENGUIN intermediate How to BUILD LINKS POST- PENGUIN By KELVIN NEWMAN, SITEVISIBILITY SEO FROM THE EXPERTS series A publication of 2 IS THIS ebook RIGHT FOR ME? Not quite sure if this ebook is right for you?

More information

How to Drive More Traffic to Your Event Website

How to Drive More Traffic to Your Event Website Event Director s Best Practices Webinar Series Presents How to Drive More Traffic to Your Event Website Matt Clymer Online Marketing Specialist December 16 th 2010 Today s Speakers Moderator Guest Speaker

More information

HOW SOCIAL MEDIA IMPACTS SEO? a publication by

HOW SOCIAL MEDIA IMPACTS SEO? a publication by HOW SOCIAL MEDIA IMPACTS SEO? a publication by Authors Written by Sarah Bundy, Founder & CEO of All Inclusive Marketing, is an award winning leader in the performance marketing space. Sarah drives the

More information

The Big Deal about Big Data. Mike Skinner, CPA CISA CITP HORNE LLP

The Big Deal about Big Data. Mike Skinner, CPA CISA CITP HORNE LLP The Big Deal about Big Data Mike Skinner, CPA CISA CITP HORNE LLP Mike Skinner, CPA CISA CITP Senior Manager, IT Assurance & Risk Services HORNE LLP Focus areas: IT security & risk assessment IT governance,

More information

Cloud Computing and Big Data. What s the Big Deal?

Cloud Computing and Big Data. What s the Big Deal? Cloud Computing and Big Data. What s the Big Deal? Arlene Minkiewicz, Chief Scientist PRICE Systems, LLC arlene.minkiewicz@pricesystems.com 2013 PRICE Systems, LLC All Rights Reserved Decades of Cost Management

More information

Cloud Computing and Big Data

Cloud Computing and Big Data Cloud Computing and Big Data. What s the Big Deal? Arlene Minkiewicz, Chief Scientist PRICE Systems, LLC arlene.minkiewicz@pricesystems.com 2013 PRICE Systems, LLC All Rights Reserved Decades of Cost Management

More information

Anuradha Bhatia, Faculty, Computer Technology Department, Mumbai, India

Anuradha Bhatia, Faculty, Computer Technology Department, Mumbai, India Volume 3, Issue 9, September 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Real Time

More information

Rapid Visualization with Big Data Analytics. Ravi Chalaka VP, Solution and Social Innovation Marketing

Rapid Visualization with Big Data Analytics. Ravi Chalaka VP, Solution and Social Innovation Marketing Rapid Visualization with Big Data Analytics Ravi Chalaka VP, Solution and Social Innovation Marketing Imagine the Future Innovative cities that dramatically enhance the wellbeing of its citizens Safer

More information

UN Global Pulse: Harnessing Big Data for a Revolution in Sustainable Development and Humanitarian Action Robert Kirkpatrick Director @rkirkpatrick

UN Global Pulse: Harnessing Big Data for a Revolution in Sustainable Development and Humanitarian Action Robert Kirkpatrick Director @rkirkpatrick UN Global Pulse: Harnessing Big Data for a Revolution in Sustainable Development and Humanitarian Action Robert Kirkpatrick Director @rkirkpatrick www.unglobalpulse.org @unglobalpulse Global Pulse Vision:

More information

Internet Marketing Institute Delhi Mobile No.: 9643815724 DIMI. Internet Marketing Institute Delhi (DIMI)

Internet Marketing Institute Delhi Mobile No.: 9643815724 DIMI. Internet Marketing Institute Delhi (DIMI) Internet Marketing Institute Delhi () LEARN INTERNET MARKETING To earn money at home and work with big entrepreneurs Batches Weekday Batches (60 hours) Morning Batches (60 hours) Evening Batches (60 hours)

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

The value of data analytics

The value of data analytics Data is the new oil. This phrase expresses in just a few words how data is changing our world just like oil did in the 20th century. According to Forbes, 89% of the business leaders think that (Big) Data

More information