Ensuring effective and smooth flow of data throughout the data life cycle



Similar documents
National Statistics Code of Practice Protocol on Data Management, Documentation and Preservation

Use of the IHSN Microdata Management Toolkit to Document Agricultural Census Data

A grant number provides unique identification for the grant.

Statistics on E-commerce and Information and Communication Technology Activity

Privacy Policy MacID. Document last updated Sunday, 28 December 2014 Property of Kane Cheshire

Principles and Guidelines on Confidentiality Aspects of Data Integration Undertaken for Statistical or Related Research Purposes

Data quality and metadata

Generic Statistical Business Process Model

The Department for Business, Innovation and Skills IMA Action Plan PRIORITY RECOMMENDATIONS

Geospatial Information in the Statistical Business Cycle 1

Conference on Data Quality for International Organizations

WHEN YOU CONSULT A STATISTICIAN... WHAT TO EXPECT

Measurabl, Inc. Attn: Measurabl Support 1014 W Washington St, San Diego CA,

ESRC Research Data Policy

National Drug Treatment Monitoring System (NDTMS) Statement of Compliance. With the National Statistics Code of Practice and Protocols

Data for the Public Good. The Government Statistical Service Data Strategy

Vermont Behavioral Risk Factor Surveillance System. Strategic Plan and Performance Measures

Statistical Data Quality in the UNECE

Guidelines for the Template for a Generic National Quality Assurance Framework (NQAF)

BC Geographic Warehouse. A Guide for Data Custodians & Data Managers

1. INFORMATION WE MAY COLLECT FROM YOU 1.1 We may collect and process the following data about you:

Privacy Policy. When you create an account or use our Service, we collect the following types of information from you:

Privacy Policy. Effective Date: November 20, 2014

2. Issues using administrative data for statistical purposes

Interagency Science Working Group. National Archives and Records Administration

Digital Continuity in ICT Services Procurement and Contract Management

REACCH PNA Data Management Plan

PRIVACY POLICY. I. Introduction. II. Information We Collect

Questionnaire on the European Data-Driven Economy

Technical Competency Framework for Information Management (IM)

International Open Data Charter

Introduction to Quality Assessment

Monitoring and Reporting Drafting Team Monitoring Indicators Justification Document

GSBPM. Generic Statistical Business Process Model. (Version 5.0, December 2013)

Privacy Policy. What is Covered in This Privacy Policy. What Information Do We Collect, and How is it Used?

IBX Business Network Platform Information Security Controls Document Classification [Public]

The ECB Statistical Data Warehouse: improving data accessibility for all users

SAP Splash Privacy Statement

Earth Science Academic Archive

1. TYPES OF INFORMATION WE COLLECT.

Information Sharing Protocol

Action Plan Checklist

Quick Reference Guide for Data Archivists

ECCnet Item Certification Get Started Checklist. Version 2.0

Data Flow Organising action on Research Methods and Data Management

IOM Data Privacy and Accuracy Policy

NSW Government Open Data Policy. September 2013 V1.0. Contact

Assessment of Child and Working Tax Credit Statistics produced by HM Revenue & Customs. Assessment Report 30

The Second National HIPAA Summit

ConteGoView, Inc. Privacy Policy Last Updated on July 28, 2015

Regulations for Data Access

Document Control. White Paper By Galaxy Consulting. At Your Service Today Tomorrow We Appreciate The Privilege Of Serving You! Abstract.

Metadata, Electronic File Management and File Destruction

General concepts: DDI

Collaboration in Data Documentation: Developing STARDAT - The Data Archiving Suite

DFS C Open Data Policy

Effective complaint handling

Checklist and guidance for a Data Management Plan

HLG Initiatives and SDMX role in them

Survey of Canadian and International Data Management Initiatives. By Diego Argáez and Kathleen Shearer

Information Management Advice 61 How to review your records holdings

PROJECT MANAGEMENT PLAN CHECKLIST

Archiving, Retrieval and Analysis The Key Issues

Planning Services. Customer focus strategy westlothian.gov.uk

Documentation of statistics for Radio and TV, Consumption 2014

Assessment of compliance with the Code of Practice for Official Statistics

Data Protection Act Guidance on the use of cloud computing

DGD ACT Health Data Quality Framework

ETHICAL ELECTRIC PRIVACY POLICY. Last Revised: December 15, 2015

Communications Strategy

CHAPTER 3. Research methodology

Software Requirements Specification for DLS SYSTEM

Mobilebits Inc. Privacy Policy

Country Paper: Automation of data capture, data processing and dissemination of the 2009 National Population and Housing Census in Vanuatu.

Checklist for ECM Success 14 Steps

Project Plan DATA MANAGEMENT PLANNING FOR ESRC RESEARCH DATA-RICH INVESTMENTS

High-level Workshop on Modernization of Official Statistics. Common Statistical Production Architecture

NERC Biodiversity and Ecosystem Service Sustainability (BESS) Data Management Strategy

Thank you for visiting this website, which is owned by Essendant Co.

Recommendations for the PIA. Process for Enterprise Services Bus. Development

Content Sheet 13-1: Overview of Customer Service

Official Journal of RS, No. 86/2006 of REGULATION

Last updated: 30 May Credit Suisse Privacy Policy

occasional paper on economic statistics SINGAPORE HOUSEHOLD BALANCE SHEET: 2005 UPDATE AND RECENT TRENDS

Transcription:

Ensuring effective and smooth flow of data throughout the data life cycle Standards, practices and procedures Olivier Dupriez World Bank / IHSN Eighth Management Seminar for the Heads of National Statistical Offices in Asia and the Pacific (3 5 November 2009)

The great thing about standards is that there are so many to choose from... 2

The data life cycle Study design Data collection Data processing and analysis Dissemination Preservation Feedback Documentation 3

DESIGN Collection Processing Dissemination Preservation RELEVANCE Count what counts Know who your users are and what their needs are Consultation with users can be formal or informal (solicited or not) Many ways to communicate with users: conferences, visits, correspondence, analyzing feedback, website usage statistics, etc. Engagement with users must be inclusive and coordinated Regularly assess relevance and publish outcome 4

DESIGN Collection Processing Dissemination Preservation http://www.ons.gov.uk/about-statistics/ns-standard/cop/protocols/index.html 5

DESIGN Collection Processing Dissemination Preservation CONSISTENCY, INTEGRATION Data production is fragmented, vehicle-driven. This causes redundancies, inefficiencies, disharmonies Solution: integration Use of common classifications, geographic referencing standards, definitions, questions and instructions across the statistical system (keep the option to diverge from standard, but with clear and explicit justification). Take advantage of international good practices (SNA, etc) Maintain a corporate inventory of holdings (metadata) Integration requires better communication within the system, but makes communication with suppliers and users much easier. 6

DESIGN Collection Processing Dissemination Preservation (Inter)national standards for integration Source of drinking water in country [X]: 9 surveys, 9 different ways of collecting data. Definition of household and of urban/rural also varies from survey to survey. 7

DESIGN Collection Processing Dissemination Preservation Country X - Rural access to improved drinking water sources 50 % population 40 30 Original trend based on 3 DHS surveys Old trend DHS 6 more well documented datasets made available DHS LFS LSS BL CWIQ GS MICS New trend DHS New trend Major revision 20 1980 1985 1990 1995 2000 2005 2010 Year 8

DESIGN Collection Processing Dissemination Preservation Standards and classifications should be accessible on your website, with guidelines for their application. IN SEARCH OF DATA INTEGRATION: NO MATCHES FOUND Gordon E. Priest, Statistics Canada http://www.amstat.org/sections/sgovt/priest.pdf 9

COLLECTION Processing Dissemination Preservation TRUST Respondents will provide more honest information when the trust is high and the burden is low Persuasion is better than obligation Respondents must be informed of the intended uses of the data and be convinced that there is a clear benefit A guarantee must be given to respondents that no statistics will be produced that are likely to identify them unless specifically agreed Laws and regulations do not provide a user friendly set of principles. Important to have a code of practice and related protocols to communicate with data suppliers. 10

COLLECTION Processing Dissemination Preservation http://www.ons.gov.uk/about-statistics/ns-standard/cop/protocols/index.html 11

Collection PROCESSING Dissemination Preservation REPLICABILITY We must know the exact process by which the data were generated and the analysis produced "The replication standard holds that sufficient information exists with which to understand, evaluate, and build upon a prior work if a third party can replicate the results without any additional information from the author. http://gking.harvard.edu/files/replication.pdf Crucial to defend your results, train new staff, etc. Importance of documentation and preservation (must be imposed to all including consultants) 12

Collection PROCESSING Dissemination Preservation CONFIDENTIALITY Everyone must be aware of the obligation to protect confidentiality and of the fact that this obligation continues after completion of service (including consultants) Data identifying respondents will be kept physically secure 13

Collection Processing DISSEMINATION Preservation TIMELINESS Release calendar and arrangements must be open and pre-announced Statistics will be released as soon as practicable once they are judged fit for purpose Release the data to all interested parties simultaneously. Early access only in exceptional circumstances, and not for personal advantage Statistics must be released separately from statements by ministers (and before) Timing not to be influenced by the content of the release 14

Collection Processing DISSEMINATION Preservation ACCESSIBILITY Promote equality of access As far as possible, the price should not be a barrier to access The web is the primary means of providing general access, but other forms of dissemination must be maintained (paper, CD-ROMs, etc) Choice and flexibility in the formats (monitor the demand!); respond to changing expectations 15

Collection Processing DISSEMINATION Preservation QUALITY, CLARITY, USABILITY, PORTABILITY Disseminate data with lots of metadata: To help users understand what the data are measuring and how the data have been created To help users assess the quality of the data Metadata standards and XML technology are convenient ways to ensure completeness and portability of metadata (provide checklist of elements) SDMX for time series data (ISO) DDI for microdata 16

Collection Processing DISSEMINATION Preservation VISIBILITY Metadata also helps users find the data; any cataloguing system is based on metadata Discovery metadata should be made available in a comprehensive catalogue covering all national statistics Monitor the demand. Make use of log files and usage statistics of your website 17

Collection Processing DISSEMINATION Preservation Example: monitoring web usage using Google analytics 18

Collection Processing DISSEMINATION Preservation CONFIDENTIALITY No statistics produced that are likely to identify an individual, unless consent provided by respondent Agency should publish information setting out its arrangements for maintaining confidentiality of data When identifying data are to be given by law, they must be released under the personal responsibility of the national statistician 19

Collection Processing DISSEMINATION Preservation http://www.ons.gov.uk/about-statistics/ns-standard/cop/protocols/index.html 20

Collection Processing DISSEMINATION Preservation 21

Collection Processing DISSEMINATION Preservation 22

Collection Processing DISSEMINATION Preservation SPECIAL ISSUE: MICRODATA DISSEMINATION Publish formal microdata dissemination policy and procedures (agency-level policy and dataset-specific policy) Provide very detailed metadata Anonymize datasets (no direct identifiers; reduced risk by controlling quasi-identifiers) No standard practice Common practices (e.g. USA Working Paper 22) 23

Collection Processing DISSEMINATION Preservation www.ihsn.org Federal Committee on Statistical Methodology, Statistical Policy Working Paper 22 (Revised 2005)- Report on Statistical Disclosure Limitation Methodology http://www.fcsm.gov/working-papers/spwp22.html 24

OPENNESS Collection Processing Dissemination Preservation FEEDBACK Provide easy way for users to give input and feedback Welcome comments, even criticism and complaints Respond (preferably openly) to enquiries Record and analyze feedback Data producer can also provide feedback to users, especially by commenting on erroneous interpretation and misuse of statistics. 25

Collection Processing Dissemination PRESERVATION COMMUNICATION WITH FUTURE GENERATIONS OF USERS AND STAFF Data are non-renewable (irreplaceable ) resources. Statistical agencies must ensure their most effective use by present and future generations IT gives a false sense of security against loss A preservation policy is needed to ensure that data and metadata are preserved against hardware or software obsolescence, media failure, and other physical threats Preserving digital information demands constant attention 26

Collection Processing Dissemination PRESERVATION www.icpsr.umich.edu/dpm/ www.icpsr.umich.edu/dp/policies/ http://www.ons.gov.uk/about-statistics/nsstandard/cop/protocols/index.html www.ihsn.org 27