Data quality vision at SiBBr. Danny Vélez




Data quality vision at SiBBr
Danny Vélez
4th SiBBr workshop: data quality and ecological data, 25-29 August 2014

SiBBr: national level
Community: universities, NGOs, government agencies, research centers, citizens
SiBBr secretariat

Current focus
Species fact sheets
Species occurrence records
Metadata
Ecological data
Genomic data

Publication workflow at SiBBr
Structuring and publication steps: data structuring with Darwin Core
Harvest, aggregate and data-store steps
Visualization through portals
Robertson et al. (2014) The GBIF Integrated Publishing Toolkit: Facilitating the Efficient Publishing of Biodiversity Data on the Internet. PLoS ONE 9(8): e102623.
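The structuring step can be illustrated with a minimal sketch, assuming a publisher whose spreadsheet uses local column names. The local names in `FIELD_MAP` and the helper functions are hypothetical and are not part of SiBBr or the IPT. The target names are standard Darwin Core terms.

```python
import csv
import io

# Hypothetical mapping from a collection's local column names (left)
# to Darwin Core terms (right). The local names are assumptions.
FIELD_MAP = {
    "id": "occurrenceID",
    "species": "scientificName",
    "date": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "collector": "recordedBy",
}


def to_darwin_core(raw_row, basis_of_record="PreservedSpecimen"):
    """Rename local fields to Darwin Core terms and trim whitespace."""
    record = {dwc: raw_row.get(local, "").strip() for local, dwc in FIELD_MAP.items()}
    record["basisOfRecord"] = basis_of_record
    return record


def write_occurrence_csv(rows):
    """Return a CSV string with Darwin Core column headers."""
    out = io.StringIO()
    fieldnames = list(FIELD_MAP.values()) + ["basisOfRecord"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(to_darwin_core(row))
    return out.getvalue()
```

A table structured this way is the kind of input a publishing tool can map to an occurrence core; the actual upload and mapping in the IPT is done through its own interface and is not shown here.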

Considerations in addressing data issues at SiBBr

1. Data quality, cleaning and correction are the responsibility of the community and cannot be assigned to any one agent in the process.
2. Museum records, and even observation occurrences, are all aggregations of records taken at different times and by different collectors. In the digital world, the flow of biological observations can go from observer to end user through multiple digital aggregators. At any node in the flow, errors can be detected, introduced or addressed. (Belbin et al., 2013)

3. Addressing data errors will require aggregators to improve their ability to detect errors and to help other agents in the chain correct them. These organizations have a responsibility, in collaboration with the community, to provide training in data management and to deliver automated mechanisms wherever possible, facilitating new processes and tools that support several aspects of data quality. (Belbin et al., 2013)

4. We need an effective way to support and stimulate feedback from data users and experts.
5. We need to move from our historical approaches, which managed only paper-based information, to one where all relevant information is generated, managed and curated in a fully interlinked form.

Data paper
What it is: a scholarly publication of a searchable metadata document describing a dataset or a group of datasets
Provides a mechanism of data quality control
Promotes and publicizes the existence of data
Provides scholarly credit to data publishers through citable journal publications
Describes the data in a structured, human-readable form
Vishwas, 2011

Data quality activity of the SiBBr secretariat: where and how?
Where:
Structuring and publication steps
Harvest, aggregate and data-store steps
Visualization through portals
How:
Giving and receiving training
Creating, using and adapting data quality and cleaning mechanisms and tools for publishers, users and the SiBBr secretariat
National and international cooperation with other agents of the publication workflow
The SiBBr secretariat will not clean the data itself

Advances
Training and manuals:
Translation of Principles of Data Quality into Portuguese
Translation of Darwin Core into Portuguese
Start of the training program on data structuring and publication
Workshops on data quality
Tools and mechanisms:
Tool for visualizing dataset completeness and quality
Data portal allowing data visualization and feedback
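As an illustration of what a dataset completeness and quality tool might compute, here is a minimal sketch. It assumes records stored as dictionaries keyed by Darwin Core terms. The function names and the specific checks are assumptions for illustration and do not describe the SiBBr tool.

```python
def completeness(records, terms):
    """Fraction of records with a non-empty value for each term."""
    total = len(records)
    if total == 0:
        return {term: 0.0 for term in terms}
    return {
        term: sum(1 for r in records if str(r.get(term) or "").strip()) / total
        for term in terms
    }


def coordinate_issues(record):
    """Return a list of simple coordinate problems found in one record."""
    try:
        lat = float(record.get("decimalLatitude", ""))
        lon = float(record.get("decimalLongitude", ""))
    except (TypeError, ValueError):
        return ["coordinates missing or not numeric"]
    issues = []
    if not -90 <= lat <= 90:
        issues.append("latitude out of range")
    if not -180 <= lon <= 180:
        issues.append("longitude out of range")
    return issues
```

The output of such checks is meant to be reported back to the data publisher for correction at the source, in line with the secretariat's stated role of detecting issues and supporting other agents rather than cleaning the data itself.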

For the near future
Consolidate the channels and mechanisms for national and international cooperation
Consolidate the SiBBr training programs, including workshops on data structuring, data papers and data users
Develop and adapt data quality tools and manuals in accordance with the needs of the Brazilian community

Data quality vision at SiBBr
As part of the national and international community interested in biodiversity data, SiBBr will contribute to the structuring, integration, publication and use of quality biodiversity data. In the coming years, SiBBr will become a reference in the implementation of mechanisms, tools and training processes that help other agents of the publication workflow address data quality issues at the national and international levels.

Thanks! Danny Vélez dvelez@lncc.br dannyvelezv@gmail.com