USE OF GEOSPATIAL AND WEB DATA FOR OECD STATISTICS


CCSA SPECIAL SESSION ON SHOWCASING BIG DATA
1 OCTOBER 2015
Paul Schreyer, Deputy-Director, Statistics Directorate, OECD

OECD APPROACH

* OECD: facilitator of discussion on new data sources for NSOs
* OECD's own use of new data sources
* From Big Data to Smart Data
* Not every New data source is Big
* Not every Big data source is New

Business value analysis: why are we working on this?
* More granularity or coverage of existing data (e.g. spatial disaggregation)
* New output (e.g. measuring trust, inequalities)
* Greater timeliness: nowcasting
* Increased impact: analysis supporting the OECD mission, possibility to link areas
* Increased responsiveness: capacity to address new topics quickly, respond to what-if questions

Business process analysis: necessary capabilities
* Capacity to identify, evaluate and access new data sources
* Command of methodology
* Proven quality and metadata frameworks
* Suitable IT infrastructures
* Established legal and ethical frameworks
* Skills and training capacity

4 types of new sources and examples of use cases
Types of sources:
* Web crawling, web scraping
* Content analysis
* Mobility studies
* Sensor and geospatial data
Examples of use cases:
* Online real estate prices (OECD GOV)
* Measuring trade restrictiveness by scraping and analysing trade laws (OECD TAD)
* African Economic Outlook (AEO): civil tensions and political governance indicators (OECD DEV)
* Big Data Measures of Human Well-Being: Evidence from US Google Index (OECD STD)
* Measure transport reliability from geolocalisation logs (ITF)
* Air quality and land cover data (OECD GOV)
* Enriching the metropolitan database using geospatial data (OECD GOV)
* PIAAC log file data (OECD EDU)

EXAMPLE 1: ENVIRONMENTAL INDICATORS
Using geospatial data (satellite data)

Average population exposure to air pollution (PM2.5)
Key messages that the indicator should communicate:
* Where air pollution is above recommended levels
* Where improvements in air quality have happened
* Linking air pollution to health

Source: raster (satellite observations)
Raster: van Donkelaar et al. (2014)
Resolution: ~10 km2
Years: 1998-2012

Ground-based stations
Advantages:
* Direct measures
* Offer regular levels of air pollution over time
* More pollutants are available
Disadvantages:
* Low coverage in developing countries
* Uneven coverage within and across countries
* PM2.5 concentration rarely monitored
* Site selection, measurement techniques, and reporting methods differ across regions and countries

Satellite observations
Advantages:
* Global coverage
* Consistent method to compute air pollution in cities, regions and countries
* Consistent time-series data, spanning more than a decade
Disadvantages:
* Modelled data
* Satellite observations are less precise for bright surfaces (snow or desert)
* Current data are on a multi-year average; evaluation of short-term events is often unavailable

Basic methodology
1. The satellite-based values of air pollution are multiplied by the population living in the area (using a 1 km2 resolution grid).
2. The exposure to air pollution in a region is given by the sum of the population-weighted values of PM2.5 in the 1 km2 grid cells falling within the boundaries of the region.
3. Finally, dividing this aggregated value by the total population in the region, we obtain the average exposure to PM2.5 concentration in a region.
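The three steps above amount to a population-weighted mean of PM2.5 over the grid cells of each region. The following is a minimal sketch of that calculation, not the OECD's implementation. It assumes each grid cell has already been assigned to a region; the function name, the tuple layout and the sample values are all hypothetical.

```python
from collections import defaultdict


def population_weighted_exposure(cells):
    """Return the average PM2.5 exposure per region.

    `cells` is an iterable of (region, pm25, population) tuples, one per
    grid cell. The tuple layout is an assumption made for this sketch.
    """
    weighted_sum = defaultdict(float)   # steps 1 and 2: sum of pm25 * population
    total_population = defaultdict(float)
    for region, pm25, population in cells:
        weighted_sum[region] += pm25 * population
        total_population[region] += population
    # Step 3: divide the aggregated value by the region's total population.
    return {
        region: weighted_sum[region] / total_population[region]
        for region in weighted_sum
        if total_population[region] > 0
    }


# Hypothetical grid cells: (region, PM2.5 value, inhabitants)
cells = [
    ("A", 10.0, 1000),
    ("A", 30.0, 3000),
    ("B", 8.0, 500),
]
exposure = population_weighted_exposure(cells)
print(exposure)
```

In region A the more populated cell dominates, so the result is closer to 30 than the unweighted cell mean of 20 would suggest. Regions with no recorded population are left out rather than dividing by zero.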

Levels and trends in OECD cities
* 68% of the urban population in OECD countries (376 million people) are exposed to pollution above the WHO's recommended levels.
* OECD estimates show wide variation in PM2.5 exposure levels across cities within countries, the largest in Mexico, Italy, Japan and Korea.
[Figure: PM2.5 exposure by country, showing metropolitan minimum, country average and metropolitan maximum; countries labelled with their number of cities. Source: Brezzi and Sanchez-Serra (2014)]

Other example: raster sources used for land cover
* Europe: Corine land cover; resolution 25 metres; years 2000-06; 44 land cover classes
* USA: National land cover dataset (NLCD); resolution 30 metres; years 2001-06; 21 land cover classes
* Japan: Japan National Land Service Information data; resolution 100 metres; years 1997-2006; 11 land cover classes
* World: MODIS 500 Map of Global Urban Extent; resolution 500 metres; year 2008; 17 land cover classes

Feeds into the OECD Regional Well-Being Database
Links: Regional Well-Being database; Regional Well-Being web tool

EXAMPLE 2: TRADE POLICY ANALYSIS
Using qualitative data from government websites

Basic idea
Traditionally:
* Policy questionnaires to countries
* Manual screening of government websites
New:
* Machine-based monitoring of government websites
* Automatic check for changes or addition of rules and regulations
Test case: qualitative information for the OECD's trade restrictiveness information and index
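One simple way to check automatically for changed or added pages is to keep a fingerprint of each page's text and compare successive snapshots. This is a minimal sketch under that assumption; the slides do not say how the OECD pilot detects changes. The function names, the snapshot format (a mapping from URL to page text) and the example URLs are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Hash page text after collapsing whitespace, so that layout-only
    changes do not register as a change in content."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def detect_changes(previous, current):
    """Compare two {url: page_text} snapshots.

    Returns (added, changed): URLs that are new in `current`, and URLs
    present in both snapshots whose text differs.
    """
    added = sorted(url for url in current if url not in previous)
    changed = sorted(
        url
        for url in current
        if url in previous
        and fingerprint(current[url]) != fingerprint(previous[url])
    )
    return added, changed


# Hypothetical snapshots taken on two different dates
previous = {
    "https://example.org/law/1": "Foreign equity is capped at 49%.",
    "https://example.org/law/2": "Licences are   renewed annually.",
}
current = {
    "https://example.org/law/1": "Foreign equity is capped at 25%.",
    "https://example.org/law/2": "Licences are renewed annually.",
    "https://example.org/law/3": "New decree on data localisation.",
}
added, changed = detect_changes(previous, current)
print("added:", added)
print("changed:", changed)
```

A hash only says that a page changed, not where. Locating the modified paragraphs requires a text comparison of the two versions.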

How?
Text comparison: initial discovery
* Run a text comparison between the original document and the new updated document
* Detect and flag specific paragraphs changed or updated inside long documents
Text comparison: advanced discovery
* Changes in rules and regulations can also happen through new pages
* Use big data techniques to compare in-house structured information to the universe of laws and regulations in a given country
* Work on text definitions similar to the original ones to help identify potentially relevant documents
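The initial-discovery step (flagging changed paragraphs inside a long document) can be sketched with Python's standard `difflib` module. This is an illustration only; the slides do not name the tool used. It assumes paragraphs are separated by blank lines, and the function name and sample legal text are hypothetical.

```python
import difflib


def changed_paragraphs(original, updated):
    """Return a list of (tag, old_paragraphs, new_paragraphs) for every
    span that differs between the two versions.

    `tag` is one of difflib's opcodes: 'replace', 'delete' or 'insert'.
    """
    old = [p.strip() for p in original.split("\n\n") if p.strip()]
    new = [p.strip() for p in updated.split("\n\n") if p.strip()]
    # Compare the documents as sequences of whole paragraphs.
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    flags = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            flags.append((tag, old[i1:i2], new[j1:j2]))
    return flags


# Hypothetical original and updated versions of a regulation
original = (
    "Article 1. Scope.\n\n"
    "Article 2. Foreign equity is limited to 49%.\n\n"
    "Article 3. Entry into force."
)
updated = (
    "Article 1. Scope.\n\n"
    "Article 2. Foreign equity is limited to 25%.\n\n"
    "Article 3. Entry into force.\n\n"
    "Article 4. Transitional provisions."
)
flags = changed_paragraphs(original, updated)
for tag, old_part, new_part in flags:
    print(tag, old_part, "->", new_part)
```

Comparing at paragraph level keeps the output short for long documents: only the amended article and the newly added one are flagged, and unchanged articles are skipped.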

IT Tools
* Web-crawling: scripts to systematically scan governmental websites where regulations can be found (federal, provincial, regional, etc.).
* Web-scraping: scripts to extract the relevant information in documents, possibly based on articles and paragraphs (text analysis).
* Document conversion: most laws and regulations are in PDF but possibly in other formats that would need to become text documents to run text analysis.
* Text comparison: tools and dictionaries to compare the text of updated documents with the original text, to calculate similarity coefficients with other documents, in a variety of languages with the option to also use proximity of similar words.
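A similarity coefficient between a reference definition and candidate documents can be sketched as cosine similarity over word counts. This is one common choice, used here as an assumption; the slides do not specify which coefficient, dictionaries or language handling the pilot relies on, and this sketch does not model proximity of similar words. All names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case word tokens; \\w is Unicode-aware, so accented letters are kept."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts (0.0 to 1.0)."""
    counts_a, counts_b = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical in-house definition and candidate documents
reference = "limits on foreign equity participation in telecommunications"
candidates = {
    "decree_a": "This decree sets limits on foreign equity participation "
                "in telecommunications operators.",
    "decree_b": "Rules for the labelling of agricultural products.",
}
# Rank candidates from most to least similar to the reference definition.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(reference, candidates[name]),
    reverse=True,
)
print(ranked)
```

Ranking candidates by such a score gives analysts a shortlist of potentially relevant documents to review by hand, rather than an automatic classification.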

Web scraping / Text analysis
Promising results on French legal texts (Legifrance)

Summary
* Significant potential
* Use cases and pilots provide really important reality checks
* Smart data and multiple sources, not necessarily big data
* Initiatives have sprung up in many parts of the OECD
* These need to be accompanied by the overall strategy being developed at the OECD

Thank you!