Managing Data Quality in OpenStreetMap



Managing Data Quality in OpenStreetMap: Tools for an Active Mapping Community
NC GIS Conference 2013
This document is licensed in its entirety under Creative Commons CC BY-SA. For the specific terms of the license, see: http://creativecommons.org/licenses/by-sa/3.0/

Overview
  The Short History of the OpenStreetMap Revolution
  Assessing Open Source Data Quality
  Overview of Tools
  Creating Tools that Matter

Overview: Key Questions
  How can crowd-sourced projects manage data quality effectively?
  What tools exist for monitoring data quality in OpenStreetMap?
  What conclusions can be drawn about existing tools?
  What is the future of data quality in crowd-sourced projects?

OpenStreetMap is
  A freely editable map of the world, unconstrained by proprietary ownership
  Wikipedia for maps

The Origins of OpenStreetMap
  OpenStreetMap.org domain registered by Steve Coast in 2004
  Project originated in the United Kingdom, where geospatial data is under Crown copyright
  Little or no public domain data
  Simple goal: to create a free, publicly available database of street centerlines

Looks like a wiki

Wiki-based Documentation!

Milestones in OpenStreetMap History
  2004: OpenStreetMap.org registered by Steve Coast
  2005: Map Limehouse, 1st OpenStreetMap mapping party
  2005: 1,000 registered OpenStreetMap users
  2006: OpenStreetMap Foundation established
  2007: 5 million ways in OSM database
  2007: 10,000 registered OpenStreetMap users
  2008: TIGER data import for the US completed
  2009: 100,000 registered OpenStreetMap users
  2010: 200,000 registered OpenStreetMap users
  2012: ~670,000 registered OpenStreetMap users

OpenStreetMap User Growth
  One million registered users worldwide!

OpenStreetMap Growth in User Edits

OpenStreetMap Database Growth

Data Quality in Crowd-sourced Projects
  Goodchild & Li identified three mechanisms for quality assurance:
    Crowd-sourcing
    Social
    Geographic
  Goodchild, Michael F., and Linna Li. "Assuring the quality of volunteered geographic information." Spatial Statistics 1 (2012): 110-120.

Crowd-sourced Approach to Data Quality
  Based on Surowiecki's Wisdom of the Crowd
  Multiple users converge around consensus solutions that might escape an individual
  Many independent observations reinforce the validity of a single observation
    Concurrence on observed features (e.g. "It's a bridge.")
    Convergence on the truth
  The group validates observations and corrects errors
  Surowiecki, J., 2005. The Wisdom of Crowds. Anchor, New York.
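The convergence idea can be illustrated with a simple majority vote over independent observations of the same feature. This is only a sketch under stated assumptions: the function name `consensus_tag`, the `min_share` threshold, and the sample reports are hypothetical and are not part of any OpenStreetMap tool.

```python
from collections import Counter

def consensus_tag(observations, min_share=0.5):
    """Return the value most observers agree on, or None if agreement is too weak.

    observations: list of values reported independently for one feature.
    min_share: hypothetical threshold; the winning value must exceed this share.
    """
    if not observations:
        return None
    value, count = Counter(observations).most_common(1)[0]
    return value if count / len(observations) > min_share else None

# Hypothetical reports from five mappers looking at the same feature.
reports = ["bridge", "bridge", "tunnel", "bridge", "bridge"]
print(consensus_tag(reports))
```

A split vote returns `None`, which is the case where the crowd has not yet converged and a human review would still be needed.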

Social Approach to Data Quality
  Through practices, users acquire reputations
  Users with good reputations are trusted
  Trust and reputation are indicators of stewardship
  As the project evolves, social leadership becomes more formalized
    The Data Working Group of OpenStreetMap fulfills this function
  Email lists supplement social stewardship

Geographic Tools for Data Quality
  The geographic approach draws on formal geographic theory:
    Spatial neighbors & auto-correlation (Moran statistics)
    Christaller's Central Place Theory
    Descriptive statistics
    Inferential statistics & Analysis of Variance (ANOVA)
    Richardson plots of linear measurements
    Cluster analysis, e.g. k-means
  These approaches have not yet been widely adopted in the OpenStreetMap project
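As one illustration of the spatial auto-correlation item, the sketch below computes global Moran's I in plain Python. The grid cells, their values, and the adjacency weights are hypothetical; a real analysis would derive them from OpenStreetMap data, for example the share of named roads per grid cell.

```python
def morans_i(values, weights):
    """Global Moran's I for values with a square spatial weights matrix.

    values: one number per areal unit.
    weights: weights[i][j] > 0 when units i and j are neighbors, else 0.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    # Cross-products of deviations, summed over neighboring pairs.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / w_total) * (cross / denom)

# Four hypothetical cells in a row; each cell neighbors the one beside it.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))  # smooth gradient: positive
print(morans_i([1, 4, 1, 4], chain))  # alternating values: negative
```

Positive values indicate that similar values cluster among neighbors, and negative values indicate that neighbors tend to differ. A cell whose quality indicator departs sharply from its neighbors could then be flagged for review.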

A Quick Survey of Data Quality Tools
  Two types of tools are in widespread use:
    Error Detection Tools
    Monitoring Tools

Error Detection Tools: Keep Right

Error Detection Tools: MapDust

Error Detection Tools: OpenStreetBugs

Error Detection Tools: No Name

Error Detection Tools: MapRoulette

Monitoring Tools

Monitoring Tools: OpenStreetMap Watch List (OWL)

Monitoring Tools: Geofabrik Map Compare

Monitoring Tools: Who Did It

Monitoring Tools: ITO TIGER Reviewed

Monitoring Tools: Green Means Go

Monitoring Tools: Who's Around Me

Social Controls
  OpenStreetMap Data Working Group (DWG)
    Resolves disputes between users
    Sets processes and protocols for data imports
    Investigates copyright infringement
    Deals with vandalism and fraud
    Suspends or closes user accounts (in case of abuse)
    Blocks IP addresses (in case of abuse)

How do Social Methods Treat Vandalism?
  OpenStreetMap is not immune from malicious intent
    Copyright infringement (e.g. copying from Google Maps)
    Graffiti
    Disputes & edit wars (e.g. Kashmir region, Palestine)
    Spam
  Tools for Managing Vandalism
    Detect: using daily diffs
    UserActivity: batch comparison of two versions of the database
    Revert: undo changeset to previous version
    Virtual Ban
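The detection step, a batch comparison of two versions of the database, can be sketched as a diff between two snapshots. This is an illustration only: the snapshots are plain dictionaries with made-up element IDs and tags, and `diff_snapshots` is a hypothetical helper, not the UserActivity tool or the actual OpenStreetMap diff format.

```python
def diff_snapshots(old, new):
    """Compare two {element_id: tags} snapshots and classify the changes."""
    old_ids, new_ids = set(old), set(new)
    return {
        "created": sorted(new_ids - old_ids),
        "deleted": sorted(old_ids - new_ids),
        # Present in both snapshots but with different tags.
        "modified": sorted(i for i in old_ids & new_ids if old[i] != new[i]),
    }

# Hypothetical snapshots taken one day apart.
yesterday = {
    101: {"highway": "residential", "name": "Elm Street"},
    102: {"highway": "primary", "name": "Main Street"},
    103: {"amenity": "cafe"},
}
today = {
    101: {"highway": "residential", "name": "Elm Street"},
    102: {"highway": "primary", "name": "BUY CHEAP WATCHES"},
    104: {"amenity": "school"},
}

changes = diff_snapshots(yesterday, today)
print(changes)
```

The modified and deleted IDs give a reviewer a short list to inspect. Restoring the earlier tags for a flagged ID from the older snapshot corresponds to the revert step.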

Summary Review
  Three methods for data quality control
    Crowd-sourced
    Social
    Geographic
  OpenStreetMap has crowd-sourced and social tools for managing data quality
    Error and monitoring tools (crowd-sourced)
    Data Working Group (social)
  Geographic methods are experimental at this time
  Increasingly complete geographic features will lead to better tools

Lessons Learned about OSM Data Quality
  Successive editing by multiple users can improve accuracy up to a point
    Haklay suggests that few improvements are made beyond the 13th edit
  Semantic differences are not easy to resolve
    Tag wars
  Obscure edits do not always get corrected if no local mappers take ownership
  Social approaches will acquire more authority
    Are part-time, volunteer staffers enough to guarantee data quality?
    What are appropriate metrics for trust and reputation?
  Haklay, M. 2010. How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Environment and Planning B: Planning and Design 37 (4), 682-703.

Thank You
  Questions?
  Steven Johnson
  (e) stevejohnson@deloitte.com
  (t) @geomantic