A data revolution for development? Giulio Quaggiotto UN Global Pulse Lab Manager, Jakarta @gquaggiotto




The promise
The challenge
New data as a practice

www.transportbuzz.com

A radically different view http://www.youtube.com/watch?v=onzcjs1pjmk&noredirect=1

An example in action

The future is already here, it is just not very evenly distributed

4 ways to data innovation
1. Funding and investment for national statistical capacity, particularly in developing countries.
2. Exploring new data sources, including those sourced from individual citizens.
3. Harnessing advanced technologies, like visualization tools that make data more understandable.
4. Liberating data to unleash the analytical creativity of users and hold policymakers accountable.
U.N. Deputy Secretary-General Jan Eliasson

The challenge

New data as a practice
Those who have done it
Dev't sector
Those who talk about it

Quiz time

GLOBAL PULSE: A NETWORK OF LABS
Pulse Lab NYC, Est. 2010
Pulse Lab Kampala, Est. 2013
Pulse Lab Jakarta, Est. 2012

New data as a practice

BIG DATA FOR DEVELOPMENT: 3 OPPORTUNITIES
1. Situational Awareness: Real-time trend analysis of population activities and dynamics can inform the design and targeting of programmes and policies.
2. Early Warning: Predictive analysis and detection of anomalous events allows rapid response to crises.
3. Programme Evaluation: Real-time feedback from citizens, and measurement of behavior change, allows for adaptive course corrections in programmes and policies.

Types of data
Example data sources Global Pulse works with:
- Social media data (blogs, forums, social media streams)
- Mobile network data (CDRs, top-ups)
- Radio feeds
- News media content
- Online search
- Postal data
- GPS data
We gain access to this type of data through partnerships with the private sector or academia.

People tweet about immunization
People share information on immunization and vaccinations with their connected friends.
Types, number of tweets, and contents:
Shared links
- 1453: Is it dangerous to have fever, swelling and pain after vaccines?
- 962: Is it true MMR vaccine can lead to autism in children?
- 800+: MoH launched Pentavalent vaccine and booster
Retweets
- 983: UNICEF reports that 30K-40K children are infected by measles every year in Indonesia.
- 772: Fever, the initial symptom of measles, will increase within the first five days and then skin rash starts.
- 755: Polio is a contagious disease which can, theoretically, infect all ages, but children are most vulnerable.

People express opinions
[Chart: number of tweets per day, 2012-01-01 to 2013-12-01; y-axis 0 to 2500]
Reports in media which prompted spikes in tweets:
[2013/12/01] Debates about assurance of halal products.
[2013/12/03] Uncertainty whether some drugs contain pig substance.
[2013/12/06] MoH starts consultations related to halal certification.
[2013/12/07] Debates over halal certificates of food.
[2013/12/12] Confirmation that some drugs and vaccines may contain haram substance.
[2013/12/12] MUI urges pharmacologists to replace haram process.

Situational awareness
[Chart: tweet volume with peaks in June 2012, Oct 2012, Apr 2013 and Dec 2013; y-axis 0 to 3000]
Peak annotations, in chart order:
- There are some autism cases after MMR vaccine
- A baby suddenly died after vaccine
- Is it dangerous to have fever, swelling, pain, after vaccine?
- China is investigating death cases of babies

Rank  2012-06-20    2012-10-08    2013-04-28       2013-12-23
1     Autism (213)  Death (1030)  Fever (1498)     Death (224)
2     Death (5)     Fever (14)    Swelling (1494)  Fever (3)
3     Sick (4)      Sick (4)      Pain (1491)      Crying (1)
4     Fever (2)     Crying (3)    Autism (1011)    Autism (1)
5     Crying (1)    Fever (3)     Fever (4)        -
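
A ranking like the one above could be produced by counting how many tweets on a given day mention each concern term. The sketch below is only an illustration under stated assumptions: the term list, the function name `rank_terms` and the sample tweets are hypothetical, and the slide does not say how the actual taxonomy or matching was implemented.

```python
from collections import Counter

# Hypothetical term list, loosely based on the terms visible in the slide's ranking.
CONCERN_TERMS = ["autism", "death", "fever", "swelling", "pain", "sick", "crying"]


def rank_terms(tweets, terms=CONCERN_TERMS, top_n=5):
    """Count tweets mentioning each term and return the top_n (term, count) pairs."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for term in terms:
            # Naive substring match; a real taxonomy would need proper
            # tokenization and Indonesian-language terms.
            if term in lowered:
                counts[term] += 1
    return counts.most_common(top_n)


# Made-up sample tweets for one day (not data from the study).
sample_tweets = [
    "Is it dangerous to have fever, swelling and pain after vaccines?",
    "My baby has a fever after the vaccine",
    "Fever and crying all night after immunization",
]
ranking = rank_terms(sample_tweets)
```

Running the same function over each day's tweets gives one column of the table per date.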

Early warning and rapid response
Early warning: Detect people concerned about death after vaccine from Twitter.
Rapid response with actionable plan: Disseminate correct information through Twitter via influential users (@dr_piprim, @dirgarambe, @blogdoktor).
[Chart: number of tweets about death; y-axis 0 to 1200]
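
One simple way to flag the kind of spike shown here is to compare each day's tweet count with a trailing baseline. This is a minimal sketch, not the method used by Pulse Lab Jakarta: the function name `detect_spikes`, the 7-day window, the threshold of 3 standard deviations and the sample counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_spikes(daily_counts, window=7, threshold=3.0):
    """Return indices of days whose count exceeds the trailing baseline.

    A day is flagged when its count is greater than
    mean + threshold * stdev of the preceding `window` days.
    """
    spikes = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        # Floor sigma at 1.0 so a perfectly flat baseline does not flag tiny changes.
        sigma = max(stdev(history), 1.0)
        if daily_counts[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes


# Made-up daily counts of tweets mentioning death after vaccination.
counts = [20, 25, 18, 22, 30, 24, 21, 23, 1030, 40]
spikes = detect_spikes(counts)
```

With these made-up counts only the ninth day (index 8) is flagged. A flagged day is a prompt for a human analyst to check the underlying tweets before any response is planned.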

Finding extracts: Gender in the workplace
Workflow: Topic Brainstorming, Taxonomy Design, Feasibility Analysis
Feasibility Study Results
Promising topics:
- Permission to work
- Perception: appropriateness of work
- Discriminatory job requirement
- Double burdens of working women
Less promising topics:
- Cost to access employment
- Lack of skills or education
Feasibility Analysis criteria shown: enough tweet volume AND sufficient relevant tweets; enough tweet volume OR sufficient relevant tweets

Finding extracts: Nowcasting Food Prices and Understanding Coping Mechanisms
Project Workflow: Data Collection & Refinement, Information Extraction, Correlation Study
Nowcasting food prices: Correlation study between (a) real-world commodity price and (b) commodity price from Twitter
Understanding coping mechanisms: Pattern discovery from (a) commodity price changes and (b) consumption patterns
Results with Initial Methodologies: We confirm that people discuss commodity prices on social media, and we test the feasibility of the extracted information. Cooking oil is a commodity which directly affects people's lives because it is a basic commodity in Indonesia, unlike condensed milk.
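
The correlation study described above can be illustrated with a Pearson correlation between an official price series and a price series extracted from tweets. The sketch below is an assumption-laden illustration: the slide does not say which correlation measure was used, and the function name `pearson` and both price series are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up monthly cooking oil prices (rupiah per litre), not study data.
official_price = [11000, 11200, 11500, 11900, 12400]
twitter_price = [10800, 11300, 11400, 12100, 12300]
r = pearson(official_price, twitter_price)
```

A coefficient close to 1 would suggest that prices mentioned in tweets track the official series; it would not by itself show that the Twitter signal can replace official price collection.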

Sinabung Eruption (15th Sep, 2013)
Infographics
Location: Karo regency, North Sumatra
Elevation: 2,460 m above sea level
Victims: BNPB (Indonesian National Board of Disaster Management) reported 15 people died, and more than 30,000 people evacuated
Volume Dynamics from Twitter
Period: 14/9/2013 to 10/2/2014
Total Twitter Posts: 151,448
Relevant Posts: 117,436 (78%)
More than 10K tweets at the first eruption

Visualizing Displacement Due to Floods through Mobile Data Partners: WFP, Govt. of Mexico, Univ. of Madrid, Telefonica Project: Visual analytics to support improved targeting of humanitarian assistance during emergencies

CDRs population estimate vs census - state of Tabasco, Mexico Source: Telefonica

A real-time map of poverty in Côte d'Ivoire?
a) Abidjan
b) Liberian border
c) Roads to Mali and Burkina Faso
d) Road to Ghana
Ref: arxiv.org/abs/1309.4496: Evaluating Socio-Economic State Of A Country Analyzing Airtime Credit And Mobile Phone

Luminosity as a proxy for GDP output
Chen & Nordhaus, Using luminosity data as a proxy for economic statistics, 2011

Understanding labour market flows Source: Using social media to measure labour market flows, March 2014

Mashing big data with sensors

and citizen generated data

Evaluating policies real time? A mobility index to evaluate H1N1 response in Mexico City Telefonica Research, 2011 (http://www.unglobalpulse.org/publicpolicyandcellphonedata)

Real time evaluation of advocacy Finding extracts

Big postal data
1+ billion records per year
Real-time scans since early 2010
Traffic between 150 countries: Letters, Packages, MoneyGram
[Chart: International Postal Traffic, Daily Weight (kg); y-axis 200,000 to 800,000; series: letter-post, parcel-post and EMS, each with MA and trend lines]

Making it happen: New data as a practice

New data as a practice
Those who have done it
Dev't & Gov't
Those who talk about it

Anatomy of a big data project
Questions -> Data dive -> Prototype -> Pilot -> Real-time tools

Project portfolio
Columns: Category, Status, Names
Category: Research projects
Status labels shown: Active, Exploration, Ad-hoc
Names:
1. Social media for social protection
2. Social media to understand public perception of immunization
3. Signals of discrimination in the workplace
4. Nowcasting food prices and understanding coping mechanisms
5. Mapping socio-economic vulnerability
6. Maternal health
7. Disaster response/resilience
8. Universal health coverage/public service monitoring
9. Deforestation
10. Providing Real-Time Insights on Indonesian Post-2015 Priorities

Skillset of a data team

Leveraging Partnerships
Big Data Access:
- Twitter (global, 500 million messages/day)
- Orange France Telecom (Ivory Coast, Senegal)
- Telenor (Bangladesh mobile money data)
- Telefonica (Mexico, Guatemala)
- MTN (Uganda)
- Real Impact (Côte d'Ivoire, Rwanda, Zambia)
- Universal Postal Union (global postal flow data)
Data Mining & Analysis Technologies:
- Amazon Web Services (supercomputing)
- DataSift (data filtering)
- SAS (analytics & data visualization)
- Crimson Hexagon (data analysis)
Data Science Expertise:
- Université catholique de Louvain (call records analysis)
- Institut des Systèmes Complexes de Paris Ile-de-France (news media mining & filtering)
- Universidad Politécnica de Madrid (call records analysis)
- Stockholm University (research fellow)
- Karolinska Institutet (call records analysis)
- University of Sheffield (speech-to-text tools)
- Microsoft Research (social media analysis)

Infrastructure example: Geolocation Augmentation
Components shown: plain tweets, AWS S3, UNGP code, Hadoop MapReduce, GeoNames database, HDFS / Impala, geolocated tweets
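
The core idea of geolocation augmentation, matching free-text locations against a gazetteer and attaching coordinates, can be sketched in a few lines. This is not the UNGP code: the `GAZETTEER` dictionary is a tiny stand-in for the GeoNames database with approximate coordinates, the `user_location` field name and the function `geolocate` are hypothetical, and the real pipeline ran as Hadoop MapReduce jobs rather than a single Python process.

```python
# Tiny stand-in for the GeoNames database; coordinates are approximate (lat, lon).
GAZETTEER = {
    "jakarta": (-6.2, 106.8),
    "medan": (3.6, 98.7),
    "surabaya": (-7.25, 112.75),
}


def geolocate(tweet):
    """Return a copy of the tweet with 'lat' and 'lon' added when a place name matches.

    Assumes (hypothetically) that each tweet is a dict with a free-text
    'user_location' field. Tweets with no match are returned unchanged.
    """
    enriched = dict(tweet)
    location = tweet.get("user_location") or ""
    for token in location.lower().replace(",", " ").split():
        if token in GAZETTEER:
            enriched["lat"], enriched["lon"] = GAZETTEER[token]
            break
    return enriched


# Made-up plain tweets.
plain_tweets = [
    {"text": "Flooding again this morning", "user_location": "Jakarta, Indonesia"},
    {"text": "Hello world", "user_location": ""},
]
geolocated_tweets = [geolocate(t) for t in plain_tweets]
```

Because each tweet is processed independently, the same function could serve as the map step of a distributed job, which is presumably why a MapReduce setup suits this task.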

Easy-to-use tools for real-time awareness

Working with us
- Horizon scanning
- Trainings/capacity building
- Secondments & residencies
- Joint prototyping
- Full research project

http://www.unglobalpulse.org Giulio Quaggiotto UN Global Pulse Lab Manager, Jakarta @gquaggiotto

3 roles for NSOs and big data
1. Third party to certify statistical quality of new sources
2. Issue statistical best practices in the use of non-traditional sources and the mining of big data
3. Use non-traditional sources to augment (and perhaps replace) official series
Source: Andrew Wyckoff, OECD

4 big data illusions
1. Uncannily accurate results
2. N = all: statistical sampling is obsolete
3. Causation is passé
4. Scientific or statistical models aren't needed: the end of theory
Source: Tim Harford, FT

Big data technologies
Big data requires different technologies and infrastructure!
- Lots of interconnected computers for data storage
- Distributed processing for analytics
One of Google's data centers (source: http://www.google.com/about/datacenters)

Big data applications
Private Sector:
- Item recommendation by Amazon
- Movie recommendation by Netflix
- Friend recommendation by Facebook
- Predicting loss of customers by mobile phone companies
- Optimizing human resources/hiring practices
Public Policy:
- Tax fraud detection
- Twitter earthquake monitoring
- Understanding perceptions and implications of policy decisions
- Now-casting food prices and inflation rates
Big data applications are already all around us!

Data science
Data science: a new discipline to analyze big data
Data scientist: a practitioner of data science with the varied skills needed to analyze big data, find patterns, generate insights, and visualize results to communicate with others
http://upload.wikimedia.org/wikipedia/commons/4/44/datasciencedisciplines.png