1 Big Data for Development: What May Determine Success or Failure? Emmanuel Letouzé. OECD Technology Foresight 2012, Paris, October 22
3 Swimming in / Drowning in an Ocean of data: Data deluge, Algorithms, Digital smoke signals, Pattern recognition, Petabytes, Digital footprints, Correlations, Data exhaust, Digital signatures, Proxy indicators, Unstructured data, Sentiment analysis, Digital crumbs, Noise, Physical sensors
4 Background and Question(s) 1. Complex, hyperconnected, volatile world. Shocks, risks, vulnerability, resilience and agility have become mainstream. Policies need to be more agile. 2. Big Data. Big excitement. Big skepticism. New Industrial Revolution? New oil that needs to be refined? Do big enough data speak for themselves? Big Data, Big Deal? Worse: a dangerous path (think conflict settings)? Can it help or change development, and how? What is the opportunity, and what are the challenges? A better question is: What will determine whether Big Data for Development succeeds or fails?
5 Big Data for Development = Success [Enables / Requires] Proximate determinants: 1. Right Intent; 2. Right Capacities [Enables / Requires] Underlying drivers: recognizing/communicating around: 1. Potential and applications; 2. Challenges and risks; 3. Necessary requirements
6 From Big Data to Big Data for Development: + purpose (intent) + computing tools, systems (capacity). It's relative. It's contextual.
7 Big data for development: Global Pulse taxonomy. Open Web Data: Web content such as news media and social media interactions (e.g. blogs, Twitter), news articles, obituaries, e-commerce, job postings; a sensor of human intent, sentiments, perceptions, and wants. Data Exhaust: passively collected transactional data from people's use of digital services like mobile phones, purchases, web searches, etc.; these digital services create networked sensors of human behavior. Citizen Reporting: information actively produced or submitted by citizens through mobile phone-based surveys, hotlines, user-generated maps, etc. Physical Sensors: satellite or infrared imagery of changing landscapes, traffic patterns, light emissions, urban development and topographic changes, etc.; remote sensing of changes in human activity.
8 Potential applications according to Global Pulse. Early warning: early detection of anomalies in how populations use digital devices and services can enable faster response in times of crisis. Digital awareness: Big Data can paint a fine-grained and current representation of reality, which can inform the design and targeting of programs and policies. Real-time feedback: monitoring a population in real time makes it possible to understand where policies and programs are failing and to make adjustments.
9 Intent and Capacity [Enables / Requires]. Early warning: digital smoke signals; syndromic surveillance, anomaly detection, extreme values. Digital awareness: digital signatures, patterns; understanding human ecosystems.
10 Ex: Conflict prevention (highly simplified). Operational prevention, i.e. conflict early warning and response systems: detect and respond to anomalies. Structural prevention, i.e. addressing structural drivers of conflict (e.g., poverty, inequality, elite capture): understand the ecosystem through digital patterns/signatures.
11 Examples. Early warning, syndromic surveillance and proxies = pick up / model anomalies. Twitter-based vs. official influenza rate in the U.S. Source: "You Are What You Tweet: Analyzing Twitter for Public Health", M. J. Paul and M. Dredze.
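The "pick up anomalies" idea on this slide can be illustrated with a very small sketch. It is not the method used by Paul and Dredze or by Global Pulse; it is a plain trailing-window z-score rule, and the function name, window length, threshold and message counts below are all hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(series, window=7, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations away from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline gives no scale to compare against.
            continue
        z = (series[i] - mean(baseline)) / sigma
        if abs(z) > threshold:
            flagged.append(i)
    return flagged

# Hypothetical daily counts of flu-related messages (made-up numbers).
daily_counts = [100, 104, 98, 101, 97, 103, 99, 102, 100, 180, 101]
print(flag_anomalies(daily_counts))  # index 9 (the spike to 180) is flagged
```

A rule this simple flags a spike but says nothing about its cause; in line with the rest of the deck, a flagged day is a prompt for contextual investigation, not a conclusion.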
12 Modeling & predicting food riots
13 Finding proxy indicators (Global Pulse and SAS project): tweets about the price of rice vs. actual price of rice in Indonesia
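One simple way to check whether a digital signal might serve as a proxy indicator is to correlate it with the official series. The sketch below is only an illustration under stated assumptions: the two series are invented numbers, not the Global Pulse and SAS data, and `pearson` is a hand-written helper rather than anything taken from that project.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)

# Hypothetical monthly values (made up for illustration only).
tweet_counts = [120, 135, 150, 160, 190, 210]       # tweets mentioning rice prices
rice_prices = [8000, 8100, 8300, 8350, 8700, 8900]  # official price, local currency

r = pearson(tweet_counts, rice_prices)
print(f"correlation between tweet volume and price: {r:.3f}")
```

A high coefficient on a short series is weak evidence on its own; the analytical challenges later in the deck (sample bias, apophenia) apply directly to this kind of check.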
14 Tweets per day about meals during Ramadan (chart markers: start of Ramadan, end of Ramadan)
15 Digital awareness = understand ecosystem. Tea Party on Twitter vs. Occupy Wall Street on Twitter. "The Tea Party appears as a far more tight-knit group (...) whereas Occupy is made up of looser clusters with a few high-profile accounts. In the lower right-hand corner = matrix of "isolates": people who are talking about the ideas of Occupy or the Tea Party, but (...) don't have connections to others on the graph. For Occupy, the number of isolates is greater, which (...) could indicate a larger potential for growth and stronger brand cachet." Source: M. Smith, cited in FP. What if we could do the same thing for militias or rebel groups: not today, but in N years, where N = 5, 10, 15...?
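The notion of "isolates" in the quoted passage has a direct graph reading: accounts that appear in the conversation but have no connection to any other account. A minimal sketch, assuming the network is available as a list of accounts plus a list of pairwise interactions (the account names and the function `find_isolates` are hypothetical, and this is not the tooling behind the cited graphs):

```python
def find_isolates(accounts, interactions):
    """Return accounts that have no interaction with any other account."""
    connected = set()
    for a, b in interactions:
        if a != b:  # a self-mention does not connect an account to anyone else
            connected.add(a)
            connected.add(b)
    return sorted(set(accounts) - connected)

# Hypothetical accounts and reply/mention pairs.
accounts = ["ana", "ben", "chi", "dev", "eli"]
interactions = [("ana", "ben"), ("ben", "chi"), ("eli", "eli")]

print(find_isolates(accounts, interactions))  # ['dev', 'eli']
```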
18 Summing up potential / applications: i. digital signatures that help better understand human ecosystems; ii. digital smoke signals for anomaly detection. May be especially relevant in risky/poor places: high rates of growth of tech penetration, with typically poor or little traditional data. But this also gives a sense of some of the risks and challenges posed by Big Data for Development.
19 Recognizing the difficulties and requirements
20 1. Data challenges Data: How do we access data and protect their producers? Challenge #1: Availability: Depends on technological penetration and use. But mobile phone use is increasing fast and supports/will support Internet access. Challenge #2: Reliability. Attempts at playing the system(s), plus suppressing signals? Especially prevalent in conflict settings.
21 1. Data challenges Challenge #3: Access. Not all data produced is easily accessible or storable (e.g. Twitter). This is compounded in poor countries. How to access data from a mobile phone carrier? Challenge #4: Privacy. That data is accessible does not make using it ethical or safe. True in all settings. In conflict/crisis settings, the privacy challenge may soon become a security risk.
22 2. Analytical challenges Challenge #1: Arrogance/naivety. Believing that the truth is in the (big) data, that the (big enough) data speak for themselves, that all it takes is to keep digging and mining to unveil the truth, that more is always better, etc. Apophenia: seeing patterns where none exist. "If you torture the data long enough, it will eventually start talking." S&P index and butter production in Bangladesh? (Leinweber, 2007)
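The apophenia point can be made concrete with a small simulation: search enough unrelated series and one of them will appear to "explain" the target. The sketch below is an assumption-laden illustration, not Leinweber's analysis; the target values are invented, the candidate series are pure Gaussian noise, and the helper names are hypothetical.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def best_spurious_fit(target, n_candidates, seed=0):
    """Strongest absolute correlation between `target` and any of
    `n_candidates` series of pure random noise."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_candidates):
        noise = [rng.gauss(0, 1) for _ in target]
        best = max(best, abs(pearson(target, noise)))
    return best

# Hypothetical ten-point target series (e.g. an index over ten years).
target = [1.0, 1.3, 1.1, 1.6, 1.8, 1.7, 2.1, 2.4, 2.2, 2.6]

# The more unrelated series we mine, the better the best "match" looks.
for n in (1, 10, 100, 1000):
    print(n, round(best_spurious_fit(target, n), 2))
```

Because the same seed is reused, each larger search contains the smaller one, so the best correlation can only stay the same or rise as more candidates are tried. That is the mechanism behind "torturing the data": the result reflects the size of the search, not a relationship.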
23 2. Analytical challenges Challenge #2: Making sense of the data. Not having the right computing capacities. Unstructured data. Sentiment analysis. Translation. Sample bias. No understanding of context.
24 3. Operational/systemic challenges Challenge #1: Legal. Data privacy? Intellectual property? Challenge #2: Changing systems, policies, frameworks, decision-making processes.
25 2. Risks Risk #1: Security and privacy of individuals and communities may be jeopardized: the dark side of Big Data. Conflict zones! Risk #2: Creating new digital divides, between the Small and the Big world, as well as within them, between communities. Who has access, who has the knowledge? Those who can answer the questions are often those who ask the questions. Risk #3: Responding foolishly, i.e. in a way that may exacerbate tensions, poverty, etc.
26 What to avoid: vertical, top-down; atheoretical, no questions asked; acontextual; automated, autarchic, unresponsive to the recommendations of practitioners and social scientists
27 Desirable principles and policies
28 Right intent, sound principles: Big Data is a tool that can help answer QUESTIONS: start from questions. Contextualization is key, especially when lives are on the line. Rely on local insights to make sense of the data. Responsibility: do no harm. Privacy and security: privacy-preserving data analytics. Build on existing systems and knowledge, e.g. thinking around strategic peacebuilding. Think incremental, complementary, iterative and long term.
29 Right Capacities. Policies and systems: Bring together data and social scientists, private sector and public sector, traditional statisticians and Big Data people. Set up Small World-Big World interdisciplinary facilities and agreements to ask questions, test and exchange, draw lessons. Create workflows as feedback loops. Engage in debates on privacy, security and responsibility; design legal frameworks.
30 Research on big data sample bias correction exists: "You Are Where You E-mail: Using E-mail Data to Estimate International Migration Rates". Figure: illustrative relationship between the correction factor for selection bias and Internet penetration rates for different choices of the parameter k.
33 Where may this take us? Lessons from the Technology Sector
35 Agile Global Development?
36 2000: Traditional planning-monitoring policymaking cycle. Policymaking/Development cycle
37 2020: Responsive policymaking cycle?? Big Data. Policymaking/Development cycle!
38 2020 Agile policymaking cycle?? All Data? All Data? All Data!!!
39 10 years ahead? Data streams will have continued to grow in volume, velocity and variety. It is very unlikely that Big Data won't be playing a significant role in Development. The question is: what do we want to achieve, and how? Big Data for Development should start small, and grow.