Using Statistical data formats in visualization




Background
- Statistics explorer: generic statistics visualization

Background
- The focus is on visualization, but visualization is useless without data, and data is useless without an easy way to load it.

Background
[Figure: Data Providers, Loaded Indicators, Selected Indicators]

Background - Data loading demo
- Start off on a bright note
- Download PC-Axis from SCB
- Load directly into Statistics explorer or Mdim explorer
- http://www.scb.se/pages/listwide 259087.aspx
- http://www.ssd.scb.se/databaser/makro/visavar.asp?yp=duwird&xu=c5587001&lang=1&langdb=1&fromwhere=s&omradekod=be&huvudtabell=befolkningny&innehall=folkmangd&prodid=be0101&deltabell=k2&fromsok=&preskat=o

Background
To be useful, our tool needs to:
- Support the most common formats
- Combine data from different sources
- Load data in an intuitive way
- Make it easy to understand WHY data is loaded in a specific way
- Tell the user what is wrong with their data

Formats
- Generic formats: Excel, txt, CSV
- Statistics formats: PC-Axis, SDMX

Generic Formats
- Users are guided to use our structure
- Simpler to have special additions like categorical data and groupings
- Proper error management and feedback goes a long way
  - Make sure the user knows what is wrong
- Limits the user to supported structures
  - Their export format either needs specific support OR they need to edit their files
- Problematic to keep track of and update data
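The point about error feedback can be illustrated with a minimal Python sketch. It is not the tool's actual loader: the function name `validate_table`, the expected layout (one header row, named numeric columns) and the sample values are all assumptions made for illustration.

```python
import csv
import io


def validate_table(text, numeric_columns):
    """Return a list of human-readable problems found in a CSV table.

    Hypothetical helper: assumes the first row is a header and that the
    columns named in `numeric_columns` must contain numbers.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["The file is empty."]
    header = rows[0]
    problems = []
    for name in sorted(numeric_columns):
        if name not in header:
            problems.append(f"Missing expected column '{name}'.")
    # Data rows start on line 2 of the file.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"Line {line_no}: expected {len(header)} cells, found {len(row)}."
            )
            continue
        for name, cell in zip(header, row):
            if name in numeric_columns:
                try:
                    float(cell)
                except ValueError:
                    problems.append(
                        f"Line {line_no}, column '{name}': '{cell}' is not a number."
                    )
    return problems


# Made-up sample data, for illustration only.
SAMPLE = "region,population\nNorrköping,100\nLinköping,n/a\nUppsala\n"
for problem in validate_table(SAMPLE, {"population"}):
    print(problem)
```

Each message names the line and column, so the user can fix the file without guessing what the loader expected.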

Excel: Categorical
[Figures: an example sheet, a treemap and a color map, each labelled with Categorical and Numerical data]

Statistics Formats
- Strictly structured
- Have identifiable properties that can be used by our tools
  - Dimensions
  - Values
  - Time
  - Metadata

Statistics Formats
- Exported data can be used directly in tools that support the format
- No need to edit or change databases, as long as they support proper export mechanisms
- Potentially much simpler to update and manage the tool's data

Common issues - Notation
Contents: Spatial
- Countries, regions
- Extra important if the tool uses a map
- Identified in different ways depending on the publisher, language and data set
  - region, country, geo, cou, location etc.
- Usage of codes and/or names differs as well
  - ISO 2/3, local code systems, only names

Common issues - Notation
Contents: Spatial
- Need to prompt the user to identify the spatial dimension
[Figure: PC-Axis prompt in Statistics explorer, reading a Finnish-language PC-Axis file]
[Figure: SDMX load interface in Statistics explorer, with loading fields for both files and the location identifier]

Common issues - Notation
Contents: Spatial
- Problems do exist for other formats as well, but there are fewer options
[Figure: prompt when reading an Excel file with data on both sheets and columns, where they couldn't be correctly identified]
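A minimal Python sketch of why a prompt is needed: a name-based guess only works when the publisher happens to use one of the familiar labels. The hint list reuses the names quoted in the slides; the function `guess_spatial_dimension` and its fall-back-to-prompt behaviour are assumptions, not the tool's real logic.

```python
# Dimension labels mentioned in the slides; real files use many more.
SPATIAL_HINTS = ("region", "country", "geo", "cou", "location")


def guess_spatial_dimension(dimension_names):
    """Return the one dimension that looks spatial, or None.

    None means the guess is missing or ambiguous, so the user
    should be asked to pick the spatial dimension.
    """
    matches = [
        name
        for name in dimension_names
        if any(hint in name.lower() for hint in SPATIAL_HINTS)
    ]
    return matches[0] if len(matches) == 1 else None


print(guess_spatial_dimension(["Region", "gender", "time"]))
# A Finnish-language file uses other words, so nothing is recognised.
print(guess_spatial_dimension(["alue", "sukupuoli", "vuosi"]))
```

Substring matching is deliberately crude here. A short hint such as "cou" will also match unrelated names, which is one more reason to confirm the choice with the user.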

Common issues - Notation
Contents: Time
- 2012-05-31
- 05-31-2012
- Q2-2012
- 2012-Q2
- January, February
- etc.
Our tools currently don't care; they only assume the values can be sorted alphabetically.
There are plans to use proper date standards, but there are many localization issues.
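The limits of alphabetical sorting can be shown with a short Python sketch. It is only an illustration of what handling a few of the notations above might involve: the function `time_key`, the choice to read 05-31-2012 as month first, and mapping a quarter to its first day are all assumptions. Month names are left out on purpose, since they depend on the language of the file.

```python
import re
from datetime import date


def time_key(label):
    """Map a time label to a sortable date; raise ValueError if unknown."""
    label = label.strip()
    m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", label)  # 2012-05-31
    if m:
        return date(int(m[1]), int(m[2]), int(m[3]))
    m = re.fullmatch(r"(\d{2})-(\d{2})-(\d{4})", label)  # 05-31-2012, month first (assumed)
    if m:
        return date(int(m[3]), int(m[1]), int(m[2]))
    m = re.fullmatch(r"(\d{4})-Q([1-4])", label)  # 2012-Q2
    if m:
        return date(int(m[1]), 3 * int(m[2]) - 2, 1)
    m = re.fullmatch(r"Q([1-4])-(\d{4})", label)  # Q2-2012
    if m:
        return date(int(m[2]), 3 * int(m[1]) - 2, 1)
    if re.fullmatch(r"\d{4}", label):  # plain year
        return date(int(label), 1, 1)
    raise ValueError(f"Unrecognised time notation: {label!r}")


labels = ["Q1-2013", "Q2-2012", "Q4-2012"]
print(sorted(labels))                # alphabetical: 2013 comes first
print(sorted(labels, key=time_key))  # chronological
```

Labels written as 2012-Q2 or 2012-05-31 already sort correctly as text, which is why the alphabetical assumption works for many files and fails for others.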

Common issues - Notation
Contents: Dimensions
- Any number of value dimensions
  - Gender: Men, Women
  - Population: Age 0-14, Age 15-64, Age 65+
- Title and Description fields
- How should these be combined in the application?

Common issues - Notation - Example
How the structure of PC-Axis is used in explorer:
- TITLE: Title of the file
- CONTENTS: Contents of the file
- STUB: dimensions
- HEADING: dimensions
- VALUES: Contains the content of dimensions
- DESCRIPTION: Description of the file

Common issues - Notation - Example
Example
- TITLE: Population numbers by gender
- CONTENTS: Population
- STUB: regions
- HEADING: gender, time
- VALUES( gender )= Men, Women
- VALUES( time )= 2000, 2001, 2002
- VALUES( region )= Norrköping, Linköping
Names of the indicators would be: "Population, Men" and "Population, Women"
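The naming in this example can be reproduced with a small Python sketch. It assumes a simple rule (CONTENTS followed by each value combination of the non-time HEADING dimensions), which matches this example but is not confirmed to be the explorer's general rule. The function `indicator_names` and its `skip` parameter are hypothetical.

```python
from itertools import product


def indicator_names(contents, heading, values, skip=("time",)):
    """Combine CONTENTS with every value combination of the HEADING
    dimensions, leaving out the dimensions listed in `skip`."""
    dims = [d for d in heading if d not in skip]
    combos = product(*(values[d] for d in dims))
    return [", ".join((contents, *combo)) for combo in combos]


values = {
    "gender": ["Men", "Women"],
    "time": ["2000", "2001", "2002"],
    "region": ["Norrköping", "Linköping"],
}
print(indicator_names("Population", ["gender", "time"], values))
```

With a second non-time dimension in HEADING, the same rule multiplies the number of indicators, so the choice of what goes into STUB and HEADING directly shapes what the user sees.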

Common issues - Notation - Example
Example from SCB
- TITLE: Statistics focused on sick leave numbers by region, time and value
- CONTENTS: Statistics focused on sick leave
- STUB: regions, variables
- HEADING: time, indicators
- VALUES( variables )= Total, Men, Women
- VALUES( indicators )= "Sick leave, days", "Percentage who contributes to sick leave, per cent"
Name of the indicators would be: "Total, Sick leave, days, Statistics focused on sick leave"

Common issues - Notation - Example
- This leaves work for the user, who has to make sure their file has a structure that fits what we do.
- Being more flexible in the tool could help, but would make reading data more complex.

Common issues
- Usage of special characters
  - () ;
- All cases have to be correctly identified
- Quite possible and simple, but time consuming
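A Python sketch of what identifying these cases can involve when reading a PC-Axis style statement such as `VALUES("gender")="Men","Women";`. It only covers the single case of double-quoted values on one statement, and the function name `parse_statement` is hypothetical. The second value in the usage example is made up to show parentheses and a semicolon inside quotes.

```python
import re

# KEYWORD, optional ("subkey"), "=", body, terminating ";"
_STATEMENT = re.compile(r'^\s*([A-Z-]+)(?:\("([^"]*)"\))?\s*=\s*(.*?);\s*$', re.S)
_QUOTED = re.compile(r'"([^"]*)"')


def parse_statement(statement):
    """Split one statement into (keyword, subkey, list of quoted values).

    Commas, parentheses and semicolons inside double quotes are kept
    as part of the value rather than treated as separators.
    """
    m = _STATEMENT.match(statement)
    if not m:
        raise ValueError(f"Not a recognised statement: {statement!r}")
    keyword, subkey, body = m.groups()
    return keyword, subkey, _QUOTED.findall(body)


print(parse_statement('TITLE="Population numbers by gender";'))
print(parse_statement('VALUES("indicators")="Sick leave, days","Share (per cent); total";'))
```

Splitting naively on commas or semicolons would break both values in the second statement, which is the kind of case that makes this work simple but time consuming.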

SDMX
Our tools can read:
- SDMX-ML: XML-based format
- It needs two files:
  - DSD: Data structure definition
  - Data
- Location/regional dimension has to be identified
- We use an open source project: flex-cb, previously developed by ECB.
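Why two files are needed can be sketched in Python. In compact SDMX-ML data, series keys appear as plain XML attributes, so a reader needs the DSD to know which attributes are dimensions. The XML below is a heavily simplified, made-up stand-in (no namespaces, headers or codelists), and the function names are hypothetical. This is not how flex-cb works.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up stand-ins for a DSD file and a data file.
DSD = '<Structure><Dimension id="REF_AREA"/><Dimension id="SEX"/></Structure>'
DATA = (
    '<DataSet><Series REF_AREA="SE" SEX="F">'
    '<Obs TIME_PERIOD="2000" OBS_VALUE="1.5"/>'
    '<Obs TIME_PERIOD="2001" OBS_VALUE="2.5"/>'
    "</Series></DataSet>"
)


def local(tag):
    """Strip an XML namespace prefix such as '{uri}Series'."""
    return tag.rsplit("}", 1)[-1]


def dimension_ids(dsd_xml):
    """List the dimension ids declared in the structure definition."""
    root = ET.fromstring(dsd_xml)
    return [el.get("id") for el in root.iter() if local(el.tag) == "Dimension"]


def read_series(data_xml, dims):
    """Return {series key: {time period: value}} using the DSD's dimensions."""
    result = {}
    for series in ET.fromstring(data_xml).iter():
        if local(series.tag) != "Series":
            continue
        missing = [d for d in dims if d not in series.attrib]
        if missing:
            raise ValueError(f"Series lacks dimensions declared in the DSD: {missing}")
        key = tuple(series.get(d) for d in dims)
        result[key] = {
            obs.get("TIME_PERIOD"): float(obs.get("OBS_VALUE"))
            for obs in series
            if local(obs.tag) == "Obs"
        }
    return result


dims = dimension_ids(DSD)
print(dims)
print(read_series(DATA, dims))
```

Identifying the location dimension is then a matter of choosing one entry from `dims`, here presumably `REF_AREA`, which is the step where the user has to be asked.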

SDMX
OECD: DotStat integration
- explorer component viewer: single-view app
- Integrated into the database
- Allows direct viewing of data in our graphs
- Flow: user selects data, query URL, OECD web service, SDMX data

SDMX
Testing with SCB and Eurostat
- Evaluating usage of SDMX
  - For regular users?
  - What kind of files are suitable?
  - Usually very large files, for database communication
- Finding bugs
  - No two SDMX implementations seem to be the same
  - Both in our reader and the export functionality

SDMX
- Often completely irrelevant to the normal user
- Extremely powerful for technical users
- Hard to use, but better tools will solve this

Web services
- Best way of acquiring data for normal users
- Format is irrelevant, black-box approach
- Example: World databank

Web services
- Standards?
- World databank uses its own API and data format

Wrapping up
- Most common format is Excel
  - Statisticians don't want a black-box format
  - Harder to detect errors in files
- PC-Axis is used by a certain group of people
  - They are usually experienced with PC-Axis editing
- SDMX is only used by technical experts
  - Used for data export and web services
  - Quite heavily promoted
  - From our point of view it's hard to know the focus of it
  - Mostly used for large files, transferred between databases

Wrapping up
- Need more structure?
  - Not at all! A flexible system will always be better
- Guidelines are important
  - Usage of codes and structures
- Know your audience
  - Make sure they have options for data structure, and that it is clear how to reach them.