Data Visualization in Official Statistics



Similar documents
Visualization and Big Data in Official Statistics

Big Data. Case studies in Official Statistics. Martijn Tennekes. Special thanks to Piet Daas, Marco Puts, May Offermans, Alex Priem, Edwin de Jonge

Big data, the future of statistics

TOP-DOWN DATA ANALYSIS WITH TREEMAPS

Data Visualisation and Its Application in Official Statistics. Olivia Or Census and Statistics Department, Hong Kong, China

Visualizing and Inspecting Large Datasets with Tableplots

Big Data andofficial Statistics Experiences at Statistics Netherlands

Big Data and Official Statistics

What is GIS? Geographic Information Systems. Introduction to ArcGIS. GIS Maps Contain Layers. What Can You Do With GIS? Layers Can Contain Features

DKAN. Data Warehousing, Visualization, and Mapping

Modifying Colors and Symbols in ArcMap

R Graphics Cookbook. Chang O'REILLY. Winston. Tokyo. Beijing Cambridge. Farnham Koln Sebastopol

Miracle Integrating Knowledge Management and Business Intelligence

Visualization methods for patent data

How To Use Gis

What's new in gvsig Desktop 2.0

Big Data as a Data Source for Official Statistics: experiences at Statistics Netherlands

Government 98dn Mapping Social and Environmental Space

NIELSEN SEGMENTATION & MARKET SOLUTIONS

Big Data and Semantic Web in Manufacturing. Nitesh Khilwani, PhD Chief Engineer, Samsung Research Institute Noida, India

MetroBoston DataCommon Training

Sisense. Product Highlights.

Quick and Easy Web Maps with Google Fusion Tables. SCO Technical Paper

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel

2.0 COMMON FORMS OF DATA VISUALIZATION

What is GIS. What is GIS? University of Tsukuba. What do you image of GIS? Copyright(C) ESRI Japan Corporation. All rights reserved.

SAS REPORTS ON YOUR FINGERTIPS? SAS BI IS THE ANSWER FOR CREATING IMMERSIVE MOBILE REPORTS

Big Data (and official statistics) *

Government 1008: Introduction to Geographic Information Systems. LAB EXERCISE 4: Got Database?

Cloud-based Geospatial Data services and analysis

Package treemap. February 15, 2013

GETTING STARTED WITH R AND DATA ANALYSIS

How To Use A Polyanalyst

A GIS helps you answer questions and solve problems by looking at your data in a way that is quickly understood and easily shared.

Bibliometric Tools. Adam Finch Analyst, CSIRO CSIRO SCIENCE EXCELLENCE

NEW TECHNOLOGIES TO IMPROVE SALES MANAGEMENT FOR ROMANIAN RETAIL COMPANIES. Ruxandra Badea, PhD Candidate. Vlad Voicu, PhD Candidate

How To Choose A Business Intelligence Toolkit

Visualization of LODES/OnTheMap Work Destination Data Using GIS and Statistical applications

INTRODUCTION TO DATA SCIENCE USING R

Review of Best Practice in Road Crash Database and Analysis System Design Blair Turner ARRB Group Ltd

Examples of Spotfire Recommendations in Action

Big CBS. Experiences at Statistics Netherlands. Dr. Piet J.H. Daas Methodologist, Big Data research coördinator. Statistics Netherlands

Making a Choropleth map with Google Fusion Tables

The Spatial Analysis and Visualization Initiative (SAVI) CERTIFICATE PROGRAM

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

University of Arkansas Libraries ArcGIS Desktop Tutorial. Section 2: Manipulating Display Parameters in ArcMap. Symbolizing Features and Rasters:

Tutorial 3 - Map Symbology in ArcGIS

The Use of Geographic Information Systems in Risk Assessment


GEOGRAPHIC INFORMATION SYSTEMS

TIBCO Spotfire Business Author Essentials Quick Reference Guide. Table of contents:

University of Arkansas Libraries ArcGIS Desktop Tutorial. Section 4: Preparing Data for Analysis

North Highland Data and Analytics. Data Governance Considerations for Big Data Analytics

White Paper. Thirsting for Insight? Quench It With 5 Data Management for Analytics Best Practices.

Crime Hotspots Analysis in South Korea: A User-Oriented Approach

OGC at KNMI: Current use and plans Available products

Client Overview. Engagement Situation. Key Requirements

Monitoring chemical processes for early fault detection using multivariate data analysis methods

Module 2: Introduction to Quantitative Data Analysis

EcOS (Economic Outlook Suite)

INTRODUCTION to ESRI ARCGIS For Visualization, CPSC 178

SAS Visual Analytics dashboard for pollution analysis

DATA VISUALIZATION GABRIEL PARODI STUDY MATERIAL: PRINCIPLES OF GEOGRAPHIC INFORMATION SYSTEMS AN INTRODUCTORY TEXTBOOK CHAPTER 7

Visualization of Semantic Windows with SciDB Integration

BIG DATA FOR MODELLING 2.0

CHAPTER 2 Land Use and Transportation

SAS VISUAL ANALYTICS AN OVERVIEW OF POWERFUL DISCOVERY, ANALYSIS AND REPORTING

IFS-8000 V2.0 INFORMATION FUSION SYSTEM

Dynamic accessibility analysis using big data

Public Sector Solutions

Exercise 1: How to Record and Present Your Data Graphically Using Excel Dr. Chris Paradise, edited by Steven J. Price

Visualization with Excel Tools and Microsoft Azure

Flexible Web Visualization for Alert-Based Network Security Analytics

Mine Water Truck Tracking Rio Tinto Kennecott Copper TERESA COCKAYNE ENVIRONMENTAL, LAND, & WATER RTKC JULY 23, 2015

HELCOM Data and Map Service. User Manual

Realist 2.0 MLS Support (512) Monday thru Friday 9:00 am 5:00 pm

GEOGRAPHIC INFORMATION SYSTEMS CERTIFICATION

Professional Services Market Study Market assessment of the Dutch professional services industry -

How To Use Statgraphics Centurion Xvii (Version 17) On A Computer Or A Computer (For Free)

Easily Identify Your Best Customers

Contents. Specific and total risk. Definition of risk. How to express risk? Multi-hazard Risk Assessment. Risk types

Demographics of Atlanta, Georgia:

ALAMA: App Logs as Measurement Approximation

TDAQ Analytics Dashboard

Transcription:

Data Visualization in Official Statistics Martijn Tennekes Jan van der Laan, Edwin de Jonge, Jessica Solcer, Alex Priem

Statistics Netherlands / CBS - Creates and publishes official statistics on economics, demographics, health care and others. - Since 1899 - Website: www.cbs.nl 2

Types of data 1. Survey data = data collected by CBS with questionnaires 2. Admin data = administrative (register) data collected by third parties such as the Tax Office 3. Big data = machine generated data of events caused by human activity Mobile phones Road sensors Social media 3

Current output StatLine: a large database (http://statline.cbs.nl) More than one billion (10 9 ) facts in more than 3000 stand-alone tables Output statistics contain uncertainty: published only rarely A few interactive visualizations (www.cbs.nl)

StatMine Interactive visual analysis layer on top of StatLine Target population: Policy makers, Journalists, Citizens, Enterprises, Economists, Social scientists, Historicians, etc Goals: Facts should be presented visually and interactively Users should be able to combine tables Present uncertainty understandable to users StatMine will soon be available in public 5

Line chart - development Bar chart - compare Bubble/scatter chart - correlation Mosaic chart - structure 6 StatMine 0.2

Uncertainty research line chart types 7

Uncertainty research bar chart types Chisel chart Cigarette chart 8

Uncertainty research - user study results Showing uncertainty improves validity of user statements Line chart: - With point estimate: ribbon - Without point estimate: error bars Bar chart: - With point estimate: chisel/cigarette - Although users prefer bar chart with error bars - Without point estimate: chisel/cigarette Users appreciate uncertainty intervals and are able to interpret graphs with uncertainty intervals. Reference: Laan, D. van der, Jonge, E. de, Solcer, J. (2015), Effect of Displaying Uncertainty in Line and Bar charts Presentation and Interpretation, Proceedings IVAPP 2015, Berlin. 9

Visualization of Large Datasets Goal: to empower data analysts with visual tools to explore (large) raw datasets, and to examine the data during statistical processes. Software: R Python Javascript d3 10

Tableplot R-package tabplot Dutch Virtual Census, 2011 test file 11

Tableplot R-package tabplot Structural Business Statistics: raw survey data (sorted by turnover) 12

Tableplot R-package tabplot Structural Business Statistics: edited survey data (sorted by turnover) 13

Tableplot R-package tabplot How representative is our survey sample? Analysis of demographics when sorted by calibration weight 14

Treemap R-package treemap Structural Business Statistics: aggregated by economic activity 15

Heatmap R-package ggplot2 Mobile phone metadata (raw): number of unique devices 16

Heatmap Python + HTML/Javascript Interactive tool to analyse income data 17

Small multiples R-package ggplot2 Analysis of Daytime Population estimates based on mobile phone metadata 18

Thematic Maps R-package tmap Interchange with traffic sensors 19

Thematic Maps R-package tmap R package tmap: Layered maps Polygons Lines Points Raster ggplot2 style small multiples Open Street Map @ user! 2015 Thursday 13:00 Population density 20

Thematic Maps R-package tmap The relation between metropolitan areas and income class @ user! 2015 Thursday 13:00 21

Thematic Maps R-package tmap Global land cover (urban areas accentuated with dot map) @ user! 2015 Thursday 13:00 22

Maps Python + HTML/Javascript Interactive tool to analyse traffic on Dutch highways 23

Summary Data visualization is essential in Official Statistics for Exploring new data sources Analysing new deliveries of existing data sources Analysing data throughout the statistical production process Presenting the data (to collegues, policy makers, and the general public) Need for Visualization of confidence intervals Interactive data exploration tools Big data visualization 24