Introduction

This survey has been developed jointly by the United Nations Statistics Division (UNSD) and the United Nations Economic Commission for Europe (UNECE). Our goal is to provide an overview of active Big Data projects in Official Statistics in order to facilitate a more informed discussion. The survey has two focuses: sharing broad information about potential Big Data projects in the statistical community and sharing specific information about partnerships, data sources, and tools.

This survey addresses individual projects, partnerships, data sources and tools. Please submit it multiple times - once for each project.

For this survey a fairly wide definition of "Big Data" has been adopted: Big data are data sources with a high volume, velocity and variety of data, which require new tools and methods to capture, curate, manage, and process them in an efficient way. The UNECE working classification of types of big data may also help define the range of potential sources of big data being considered. This is a working classification, and is not expected to be complete, so if you find a missed area please let us know.

The survey is meant for projects at every stage of development. If your project is still in the idea phase we would like to hear about it and the data sources and partnerships you are exploring. Just leave any area that is not relevant to you blank.

At the start of the survey you will have a chance to let us know how widely you are able to share the submitted information. At a minimum all information submitted will be shared between the survey authors and used in aggregate or anonymous form at the upcoming International Conference on Big Data for Official Statistics in Beijing and in reports to the UNECE High-Level Group for the Modernisation of Statistical Production and Services.

If you have any information you would rather email directly, or have a question, email tradestat@un.org. Questions may also be submitted online at the Big Data Inventory Q&A page.

PLEASE NOTE: submission for this survey is online only. This PDF copy is only for reference. Submit answers at: https://www.surveymonkey.com/s/bigdataproject

Thank you for your time and participation.

Page 1

Organizational Information

Organization:
If there are multiple organizations, then the one leading the project.

Division:
If applicable, the division or subunit of the organization doing the work.

Country:

Point of contact:
Name:
Position:
E-mail:

Can we share your organization and project title publicly?
- Yes, you may share it publicly [published openly online]
- Yes, you may share it with organizations participating in the survey [published online behind password]
- No, do not share this information except in aggregate / anonymous form

Can we share the detailed information you submit?
Please be as open as possible. We are collecting this information primarily to help the wider official statistics community have an informed discussion. If there are a few details you would like to keep confidential you may submit them by email instead of including them in this survey.
- Yes, you may share it publicly [published openly online]
- Yes, you may share it with organizations participating in the survey [published online behind password]
- No, do not share this information except in aggregate / anonymous form

Further comments:

Page 2

Project Information

Project title:
A descriptive title for the project or proposed project. If no official title has been chosen then something that communicates the main idea.

Project status:
- Idea phase [skip to page 6]
- Proposed (in planning - not yet approved or funded)
- Approved (approved - not yet funded)
- Funded (approved and funded - not yet started)
- Ongoing (in execution phase)
- Completed

Page 3

Potential areas of use for this project:
Select all that apply.
- Demographic and social statistics (including subjective well-being)
- Economic and financial statistics
- Environmental statistics
- Information society / ICT statistics
- Labour statistics
- Mobility statistics
- Price statistics
- Tourism statistics
- Transportation statistics
- Vital and civil registration statistics
- Other domains of official statistics

Would you qualify the project as:
- Exploratory / research
- Pilot with a goal of moving it to production if successful
- For the production of statistics
- Other (please specify)

Page 4

Project overview:
Include broad information about your project objectives and scope with an emphasis on the implications for official statistics. Also indicate whether the project is primarily for research purposes or for production of statistics based on Big Data. 1-3 paragraphs

Page 5

Project Information

Outcomes (for incomplete projects include project goals):
A summary of the results or desired results of the project with an emphasis on the implications for official statistics. When discussing actual outcomes, please note how detailed the project output is, e.g. coordinate (GPS), regional, or national information updated daily, monthly, or annually. 1-2 paragraphs

Most important lessons learned so far in the project:
These might have to do with methodological issues, project management, training personnel, how to get funding, the technical tools used in the project, or something else entirely. Essentially, describe the largest challenges you have faced so far and how you have overcome them (or plan to). 1-2 paragraphs

Page 6

Project Information

Future directions:
For completed projects: what are your next steps? For projects still in the early stages: discuss upcoming plans and challenges. Detailed questions about partnerships and data sources appear later in the questionnaire. 1-2 paragraphs

Page 7

Partnerships

Do you have any partnerships with other organizations or data providers on this project?
The partnerships may still be in the very early stages.
- Yes
- No [skip to page 10]

Page 8

Partnerships

Please discuss any arrangements you have with your primary partner organization. If you have more than one partner on this project please discuss them in the other comments space at the end of the partnerships section.

Name of partner:
If you do not wish to disclose the name, please supply a working label - e.g. "Partner - Mobile Phone Data Provider".

Have you already discussed this partner when submitting information about a different project?
There is no need to enter partner information again if you have already done so on another project - you may leave the rest of this section blank. But if there are details about the partnership specific to this project that you would like to provide, you may do so.
- Yes (skip the partnerships section)
- Yes (do not skip)
- No [skip to page 10]

If yes, please specify the project title:

Page 9

Partnerships

Type of partner organization:
Select all that apply.
- International Organization
- Government
- Commercial
- NGO
- Academia
- Other (please specify)

Type of partnership:
Select all that apply.
- Data provider
- Data consumer / data aggregator (not first origin of data sources)
- Design partner
- Technology partner
- Analytical partner
- Other (please specify)

Current status of the partnership:
We understand that forming a partnership may not fit cleanly into these categories. Please include further details if required in the 'Other comments' section below.
- In discussion
- Prototyping / Testing (some data partners allow this before a contract is signed)
- Contract in place
- Other (please specify)

Page 10

Are there any payments or financial arrangements with this partner?
- Yes
- No
- Not applicable / Do not wish to share

Details of the financial arrangements:

Other comments:
Please discuss the organizational arrangements and the history of the partnership if applicable. If you have other partners on this project you may discuss them here. 1-2 paragraphs

Page 11

Data sources

Do you have any data sources for this project?
- Yes, we already had the data in our organization [skip to page 12]
- Yes, we have identified a new source and received the data [skip to page 12]
- Yes, we have a new source and are in discussions with the data provider to obtain the data
- Yes, we have identified a new source, but no discussion with the data provider has taken place
- No specific source has been identified yet

Page 12

Data sources & analysis (idea / discussion phase)

If there are sources that have been explored, but you still do not have data, please discuss them here:

Please discuss your planned data analysis tools and skills:
For instance, are you considering using R, SAS, Python or other tool(s) for analysis? What tools are you already familiar with? What are you considering for the data store - local files, Hadoop, a NoSQL database, or a traditional relational database? Is your preference to run this on your own infrastructure, or on external infrastructure? Either way, what challenges do you face?

[SKIP TO PAGE 15 - FINAL COMMENTS]

Page 13

Data sources

Name of data source:

Have you already discussed this data source when submitting information about a different project?
There is no need to enter the information again if you have already done so on another project - you may leave the rest of this section blank. But if there are details about the data source specific to this project that you would like to provide, you may do so.
- Yes (skip the data sources section)
- Yes (do not skip)
- No [skip to page 14]

If yes, please specify the project title:

Page 14

Data sources

Data source description:
A brief description of the data source.

Type of Big Data:
Choose the most specific category that describes your data source.
[List does not appear in PDF.] See: http://www1.unece.org/stat/platform/display/bigdata/classification+of+types+of+big+data

Who is the provider of the data source?

What is the geographical scope of the data source?
- Local
- Regional
- National
- International
- Other (please specify)

Page 15

How granular is the information in the data source?
This should correspond to the unit of time used to mark individual records. For instance, a weather station might have a timestamp associated with each observation, but in the data set from the provider the data may be aggregated and averaged by hour. If multiple levels of granularity are available, specify the most detailed and describe the mix in the data description.
- Timestamp (seconds, milliseconds, or more specific)
- Minutes
- Hours
- Days
- Weeks
- Months
- Years
- Other (please specify)

How frequently are data source updates made available?
You may not consume each update, but the updates are made available for consumption. If the data source falls between two categories, choose the higher-frequency category, e.g. a data source that posts updates every half hour can be considered constant.
- Constantly
- Hourly
- Daily
- Weekly
- Monthly
- Quarterly
- Annually
- Nearly static (highly infrequent / no schedule)
- Other (please specify)

Page 16

Have you established automatic links for transmitting this data source (e.g. API, automatic file download)?
- Yes
- No
- Other (please specify)

Links to the data source (if available):
If available include both the data source and a link to any data documentation. If there aren't public links but you would like us to host the files please email tradestat@un.org.
Data (URL):
Documentation (URL):

Is this data source publicly available?
- Yes - accessible to everyone in an easy to use format (CSV, XML, JSON, API, Excel, etc.)
- Yes - accessible to everyone, but requires significant work to reformat (e.g. PDF, screen scraping, etc.)
- No - requires explicit permission and is not publicly posted

Are there any privacy and confidentiality issues related to this data source?
If yes, please provide details about how you have addressed those issues. For instance, did you remove personal characteristics or change the geographic scope of the data? Was this done by you or by the provider? Did this degrade the usefulness of the data for analysis?
- No
- Yes (please give details):

Page 17

Any other comments about this data source or data provider:
Some topics to consider addressing are...
- What were the largest limitations in working with this data source and how did you overcome them?
- What were the most useful levels of aggregation?
- What were the greatest challenges you had working with the data?

Page 18

Data analysis, tools and skills

Do you integrate traditional data sources with the new "Big Data" source discussed above?
- No
- Yes (please give details):

In your project, what technologies, methods and tools did you use during the Big Data processing life cycle?
e.g. the SVM implementation in Python/scikit-learn to identify likely tourists, and Hadoop/MapReduce for preprocessing aggregation.

Hosting provider and/or partner:
Did you use a 3rd party, such as Amazon, deploy on your own servers, or share resources with a partner organization? If you are comfortable sharing it, approximately how much did this cost?

Page 19

Final comments

Do you have any other comments you would like to share?

Page 20