Introduction to IODE Data Management Greg Reed Past Co-Chair IODE
Outline Background - Introduction to IOC and IODE - IODE activities Oceanographic data management - End to end data management - Data stewardship IODE training curriculum for oceanographic data management
Intergovernmental Oceanographic Commission Intergovernmental Oceanographic Commission - UN body for ocean science, ocean data and information exchange, ocean services Headquarters in Paris, France Field offices - Colombia, Brazil, Thailand, Kenya, Denmark, Belgium, Australia 146 Member States
IOC Objectives Prevention and reduction of the impacts of natural hazards Mitigation of the impacts of and adaptation to climate change and variability Safeguarding the health of ocean ecosystems Promoting management procedures and policies leading to the sustainability of coastal and ocean environment and resources
IOC Oceanographic Data Exchange Policy The timely, free and unrestricted international exchange of oceanographic data is essential for the efficient acquisition, integration and use of ocean observations gathered by the countries of the world for a wide variety of purposes including the prediction of weather and climate, the operational forecasting of the marine environment, the preservation of life, the mitigation of human-induced changes in the marine and coastal environment, as well as for the advancement of scientific understanding that makes this possible. Clause 1 - Member States shall provide timely, free and unrestricted access to all data, associated metadata and products generated under the auspices of IOC programmes.
IOC Strategic Plan June 2013: IOC Assembly endorsed the IOC Strategic Plan for Oceanographic Data and Information Exchange (2013-2016) A comprehensive and integrated ocean data and information system, serving the broad and diverse needs of IOC Member States, for both routine and scientific use.
IOC Data & Information System The IOC Strategic Plan will produce a Data and Information Management System that will deliver: - Assembled, quality controlled and archived data - Timely dissemination of data on a diverse range of variables - Easy discovery of and access to data on a diverse range of variables and derived products For all IOC programmes
IODE Programme IODE: International Oceanographic Data and Information Exchange Established in 1961 to enhance marine research, exploitation and development by facilitating the exchange of oceanographic data and information between participating Member States and by meeting the needs of users for data and information products
IODE Objectives The objectives of the IODE programme are: - To facilitate and promote the discovery, exchange of, and access to marine data and information - To encourage the long term archival, preservation and documentation of marine data and information - To promote the use of best practices for management of marine data, including international standards - To assist Member States to acquire the necessary capacity to manage marine research and observation data - To support international scientific and operational marine programmes including Framework for Ocean Observing (revised IODE-XXII, 2013)
IODE Committee The IODE Programme is managed by the IODE Committee - Meets every 2 years The IODE Committee reports to the IOC Governing Bodies (IOC Assembly and Executive Council) Membership of the IODE Committee includes: - IODE Co-Chairs - IODE National Coordinators for Ocean Data Management - IODE National Coordinators for Marine Information Management
Groups of Experts The IODE Committee has established a number of small groups that provide expert advice to the IODE Committee The following Groups of Experts are currently active: - IODE Group of Experts on Biological and Chemical Data Management and Exchange Practices (GEBICH) - IODE Group of Experts on Marine Information Management (GEMIM) - IODE Group of Experts on the Biogeographic Information System (GEOBIS and SG-OBIS) - Joint JCOMM/IODE Expert Team on Data Management Practices (ETDMP)
IODE: global network National Oceanographic Data Centres & Marine Information Centres - 80 NODCs - 50 national coordinators for Marine Information Management
New structural element - ADU IODE-XXII endorsed new structural element - IODE Associate Data Unit (ADU) ADUs will allow the wider ocean research and observation communities to join IODE network ADUs can be projects, programmes, institutions or organizations that manage oceanographic data. - ADUs do not replace NODCs but will contribute to the objectives of NODCs
IODE NODC National Oceanographic Data Centre (NODC) is a facility for providing ocean data and information in a usable form to a wide user community - National: acquire, process, quality control, inventory, archive and disseminate data - International: international data exchange and dissemination
NODC functions Receive data from national, regional and international programmes Verify the quality of the data (using agreed standards) Ensure long term preservation of data and associated metadata required for correct interpretation of the data Make data available, nationally and internationally (see: IOC Manuals and Guides No. 5)
IODE Activities IODE is responsible for a number of Global and Regional Activities Global activities are usually implemented by Steering Teams Regional activities focus on capacity development related to oceanographic data and information management through the ODIN network
Ocean Data Standards The IODE Ocean Data Standards Pilot Project (ODS) aims to achieve broad agreement and commitment to adopt a number of standards related to ocean data management and exchange. Three international standards have been recommended: - Recommendation to Adopt ISO 8601:2004 as the Standard for the Representation of Date and Time in Oceanographic Data Exchange - Recommendation to Adopt ISO 3166-1 and 3166-3 Country Codes as the Standard for Identifying Countries in Oceanographic Data Exchange - Recommendation to Adopt the Quality Flag Scheme for the Exchange of Oceanographic and Marine Meteorological Data Other standards under consideration include metadata, platform names and types, keywords (parameters, instruments)
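As a minimal sketch of what the ISO 8601 recommendation means in practice, the snippet below writes and reads a UTC timestamp in the extended form `YYYY-MM-DDThh:mm:ssZ`. The function names are hypothetical, and the choice of this one profile (UTC, whole seconds) is an assumption made for illustration rather than something the recommendation prescribes.

```python
from datetime import datetime, timezone

ISO_UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # one ISO 8601 extended form, UTC only


def to_iso8601(dt: datetime) -> str:
    """Format a time-zone-aware datetime as an ISO 8601 UTC timestamp."""
    if dt.tzinfo is None:
        # A naive datetime is ambiguous in data exchange, so refuse it.
        raise ValueError("naive datetime: the time zone must be explicit")
    return dt.astimezone(timezone.utc).strftime(ISO_UTC_FORMAT)


def from_iso8601(text: str) -> datetime:
    """Parse the same UTC form back into a time-zone-aware datetime."""
    return datetime.strptime(text, ISO_UTC_FORMAT).replace(tzinfo=timezone.utc)


sample = datetime(2009, 11, 20, 15, 0, 0, tzinfo=timezone.utc)
stamp = to_iso8601(sample)
print(stamp)  # 2009-11-20T15:00:00Z
```

Because the fields run from most to least significant, timestamps written this way also sort chronologically as plain strings.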
Ocean Data Portal The Ocean Data Portal (ODP) facilitates and promotes the exchange and dissemination of marine data and services. The ODP provides the full range of processes including data discovery, access, and visualization. The ODP supports the data access requirements of all IOC programme areas
Quality Management Framework The main objectives of the IODE-QMF are: - Initiate and review existing Standards, Manuals and Guides with respect to the inclusion of quality management procedures and practices - Offer assistance in establishing organizational quality management systems - Promote accreditation of NODCs according to agreed criteria - Provide regular feedback to the IODE Committee and IOC Assembly Accreditation of data centres is needed to ensure NODCs can provide data of known quality to meet the requirements of a broad community of users
Other IODE Projects GTSPP. Global Temperature-Salinity Profile Program is an international project to develop and maintain a global ocean Temperature-Salinity resource. GOSUD. Global Ocean Surface Underway Data Pilot Project is a global project to collect, process, archive and disseminate sea surface salinity and other variables collected underway by research and opportunity ships. OBIS. Ocean Biogeographic Information System is a web-based access point to information about the distribution and abundance of living species in the ocean. ICAN. International Coastal Atlas Network facilitates the development of digital atlases of the global coast, based on the principle of distributed, high-quality data and information. OceanDocs. Electronic repository of research & publications in marine science in digital form, including preprints, published articles, technical reports, working
IODE and Capacity Development Capacity development has been a cornerstone of the IODE since the programme's inception in 1961 The objective is to assist Member States to acquire the necessary capacity to manage marine data and information and become partners in the IODE network. Capacity development focuses on: - the principles of data and information management - promoting the use of standards across the IODE network resulting in interoperability between centres.
IODE Capacity Development Strategy ODIN Ocean Data and Information Network ODIN is based upon four elements: i. providing equipment ii. providing training iii. providing seed funding for operational activities for newly created data centres and marine libraries iv. working in a regional context to address common and national goals ODIN regional networks - ODINAFRICA. 25 African countries - ODINCARSA. Latin America and Caribbean - ODINCINDIO. Central Indian Ocean - ODINWESTPAC. Western Pacific region - ODINBlackSea. Black Sea region - ODINECET. European Countries in Economic Transition - ODIN-PIMRIS. Small Island Pacific States
IODE Training Tool: OceanTeacher OceanTeacher is a learning management system for marine data and information management Audience: - Data and information managers - Ocean researchers - University students Increasing focus on continuous professional development http://www.oceanteacher.org
OceanTeacher model Web-based training system that supports: - Classroom training (face-to-face) - Blended training (instructor-led, online) - Online self-learning Content freely and openly available - Creative commons licence Courses. Training courses using Moodle Digital Library. Online encyclopaedia about marine data management and marine information management Video Library. Lecture recordings
IODE Training Centre Oostende, Belgium; established 2005 Centre hosts: - International training centre - International conference centre - Expert centre Approx. 15 events held each year Support from Government of Flanders - Cooperation with Flanders Marine Institute
Global Classroom New initiative to establish regional and specialised training centres - Serve regional needs - Use local expertise - Provide courses in regionally relevant languages Centres will use the OceanTeacher e-learning management system
OceanTeacher Global Classroom OceanTeacher Global Classroom model will blend traditional classroom-based training with distance learning, by using video streaming technology to enable multi-site classrooms Successfully trialled in 2012 - 2 groups of students attended the same course - one group at IODE Project Office in Oostende, Belgium - another at INCOIS (International Training Centre for Operational Oceanography) in Hyderabad, India.
IODE summary IODE supports an international network of data centres and information centres IODE community has a range of expertise in data management and product delivery IODE encourages the free and open exchange of marine data through the implementation of the IOC Data Policy IODE is developing standards to improve data management and exchange IODE is developing the Ocean Data Portal to facilitate access to data IODE has substantial experience in capacity development
Oceanographic Data Management
Oceanographic data are important Marine data are fundamental to understand processes that control the environment Marine data are a key requirement for effective strategic decision making - Play an important role in promoting the development of economic activities Underpin many of our activities, such as: - navigation - sea transportation - fisheries - marine disaster mitigation - environmental monitoring
Oceanographic data are unique Marine data are expensive to collect Marine data are unique and unrepeatable - the environment is constantly changing Spatial and temporal coverage is quite sparse - Research vessels and moorings are small dots on a map Important to ensure that maximum benefit is derived from data - Share data! Capture once, use many times
Data Stewardship The acquisition, processing, preservation, quality assurance and dissemination of marine data are key elements of curation and archiving data This is known as Data Stewardship A community approach Stewardship ownership
Components of Data Stewardship Key components include - Preserving detailed information about observed variables - Details of observation instruments used - Techniques and calibrations - Comprehensive metadata - Creation of products derived from archived data
End-to-end data management End-to-end data management system handles data from the point of collection, through processing and quality control, to archival and dissemination. End-to-end management of data facilitates the ability to integrate data from multiple sources and sensors (e.g. satellites, in situ, and model data)
Elements of end-to-end data systems 1. Standardized data collection - The lack of standardized data collection efforts can hamper long-term value of datasets - Data collection must be standardized to allow data sets from a variety of sources to be integrated.
Elements of end-to-end data systems 2. Common vocabularies - Use of common vocabularies is an important prerequisite for consistency and interoperability. - Common vocabularies consist of lists of standardized terms that cover a broad spectrum of disciplines of relevance to the oceanographic and wider community. - Using standardized sets of terms reduces ambiguities and enables records to be interpreted by computers.
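To illustrate how a standardized term list removes ambiguity, the sketch below maps several locally used column labels onto one agreed term each. The mapping table and the `standardize` helper are hypothetical; a real data centre would draw its terms from a vocabulary governed by the community rather than from a hard-coded dictionary.

```python
# Hypothetical lookup from locally used labels to one agreed term each.
STANDARD_TERMS = {
    "temp": "sea_water_temperature",
    "water temperature": "sea_water_temperature",
    "sst": "sea_surface_temperature",
    "sal": "sea_water_practical_salinity",
    "psal": "sea_water_practical_salinity",
}


def standardize(label: str) -> str:
    """Return the agreed term for a local label, or fail loudly if unknown."""
    # Normalise case and whitespace so trivial spelling variants still match.
    key = " ".join(label.strip().lower().split())
    if key not in STANDARD_TERMS:
        # Guessing would reintroduce the ambiguity the vocabulary removes.
        raise KeyError(f"no standard term registered for {label!r}")
    return STANDARD_TERMS[key]


print(standardize("  Water   Temperature "))  # sea_water_temperature
print(standardize("PSAL"))                    # sea_water_practical_salinity
```

Rejecting unknown labels, instead of passing them through, keeps unmapped terms from silently entering an archive.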
Elements of end-to-end data systems 3. Standard data formats - The selection and adoption of a small number of standardized data formats is essential to ensure effective data management - The use of just a few formats can enhance the ability to manage and preserve data over the long-term - netCDF (Network Common Data Form) is emerging as a de facto standard
Elements of end-to-end data systems 4. Quality assurance and quality control (QA/QC) - Quality Assurance (QA) - procedures performed prior to instrument deployment to support the return of best possible quality data - Quality Control (QC) - procedures/processes applied to the data returned from the instrument - Standards for QA/QC should be well documented - The preservation of original values, even if they appear wrong, is important for possible future re-processing
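The point about preserving original values can be sketched as a range check that attaches a quality flag to each observation and never overwrites the measurement. This is a minimal illustration only: the flag numbers, the `FlaggedValue` type and the thresholds are assumptions chosen for the example, not values quoted from any recommended flag scheme.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Illustrative flag values (assumption, not taken from a published scheme).
GOOD, BAD, MISSING = 1, 4, 9


@dataclass(frozen=True)
class FlaggedValue:
    value: Optional[float]  # the original measurement, never modified
    flag: int               # the QC verdict attached alongside it


def range_check(values: Sequence[Optional[float]],
                valid_min: float, valid_max: float) -> List[FlaggedValue]:
    """Flag each value against a valid range while keeping it unchanged."""
    flagged = []
    for value in values:
        if value is None:
            flag = MISSING
        elif valid_min <= value <= valid_max:
            flag = GOOD
        else:
            flag = BAD  # suspicious, but retained for later re-processing
        flagged.append(FlaggedValue(value, flag))
    return flagged


raw = [12.4, 45.0, None, -1.8]  # hypothetical water temperatures in Celsius
checked = range_check(raw, valid_min=-2.0, valid_max=40.0)
print([item.flag for item in checked])   # [1, 4, 9, 1]
print([item.value for item in checked])  # [12.4, 45.0, None, -1.8]
```

Keeping the flag separate from the value means a later, better test can re-evaluate the 45.0 reading instead of finding it already deleted.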
Elements of end-to-end data systems 5. Data archive - To ensure the long-term preservation and dissemination of data, the data producers and data archives need to work together to generate the information needed to be able to understand and re-use data. This includes: Well-defined file naming conventions and format descriptions Related descriptive metadata Information to facilitate data dissemination
Role of Researchers Data derived from publicly-funded research should be available for public use The value of data increases if they are aggregated into collections and made available for re-use Researchers can work collaboratively with the wider community to develop a data management framework Example: World Ocean Database - Project established by IOC - World's largest collection of ocean profile-plankton data - More than 9 million temperature and 3.6 million salinity profiles - Result of international exchange of oceanographic data
IODE Training Curriculum for Oceanographic Data Management
OceanTeacher Academy IODE identified a requirement for a regular cycle of standardized courses that are relevant for all regions. OceanTeacher Academy provides an annual teaching programme of courses related to oceanographic data and information management - Includes topics focussed on development of related products and services that contribute to the sustainable management of oceans and coastal areas Since 2005 OTA has organized over 50 courses for over 1200 students from 120 countries and taught by 20 lecturers.
OTA marine data management courses OTA offers a broad programme of courses including: - Fundamentals of Ocean Data Management - Ocean Data Products and Synthesis - Operational Oceanography - Marine GIS Applications - Marine and Coastal Atlas Development - University Accredited courses - Specialized courses - Ocean Data Portal installation - Marine Spatial Planning - MetOcean Modelling - Marine Metadata
Marine data paradigm (from marinedataliteracy.org)
Marine data formats There are many types of marine data formats Can be grouped (from OT Digital Library) library.oceanteacher.org
Self-describing formats These formats are used by the operational community - meteorological community (GRIB, BUFR) - satellite community (HDF) - ocean observing systems (netCDF). GRIB, HDF are used for gridded or raster data netCDF can support station data and gridded data BUFR is used operationally for real-time global exchange of weather and ocean observations on the GTS These formats contain extensive internal metadata providing user systems with all the information needed for both data discovery and practical usage.
netCDF format NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. (from UCAR) Community standard for sharing scientific data. Some features: - Self-Describing: includes metadata as well as data - Portable: data written on one platform can be read on other platforms - Direct-access: subset large datasets - Appendable: easily append to a netCDF file - Networkable: client access to structured data on remote servers - Extensible: add new dimensions, variables, or attributes - Archivable: access to all earlier forms of netCDF data supported
Common Data Language - CDL ASCII representation of netCDF Example: profile data 3 parts - dimensions - variables - data

    netcdf SST_T_20091120T150000Z_VNSZ_FV01 {
    dimensions: // dimension names are declared first; sizes match the coordinate values under data:
        TIME = 5 ;
        LONGITUDE = 2 ;
        LATITUDE = 3 ;
    variables: // variable types, names, dimensions and attributes
        double TIME(TIME) ;
            TIME:long_name = "time" ;
            TIME:units = "days since 1950-01-01 00:00:00Z" ;
            TIME:standard_name = "time" ;
        float LONGITUDE(LONGITUDE) ;
            LONGITUDE:long_name = "longitude" ;
            LONGITUDE:units = "degrees_east" ;
            LONGITUDE:standard_name = "longitude" ;
        float LATITUDE(LATITUDE) ;
            LATITUDE:long_name = "latitude" ;
            LATITUDE:units = "degrees_north" ;
            LATITUDE:standard_name = "latitude" ;
        float TEMP(TIME, LATITUDE, LONGITUDE) ;
            TEMP:long_name = "Water Temperature in degrees C" ;
            TEMP:units = "Celsius" ;
            TEMP:standard_name = "sea_water_temperature" ;
            TEMP:_FillValue = 99999.f ;
            TEMP:valid_min = -2.0f ;
            TEMP:valid_max = 40.0f ;
    data: // 5 x 3 x 2 = 30 TEMP values, last dimension varying fastest
        TIME = 0.5, 1.5, 2.5, 3.5, 4.5 ;
        LATITUDE = 54.2, 54.4, 54.6 ;
        LONGITUDE = 2.1, 2.5 ;
        TEMP = 34.5, 31.2, 23.7, 19.6, 35.8, 29.2,
               24.4, 5.6, 7.2, 8.1, 18.6, 15.2,
               13.1, 4.6, 3.7, 8.2, 9.7, 34.2,
               26.7, 28.7, 2.1, 3.4, 5.6, 7.8,
               9.0, 10.2, 11.2, 11.6, 11.7, 11.8 ;
    }
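The `units` attribute on TIME is what makes the numbers self-describing: a reader needs nothing beyond "days since 1950-01-01 00:00:00Z" to turn 0.5 into a calendar time. A minimal sketch of that decoding, using only the Python standard library (the helper name is hypothetical, and a real application would normally let a netCDF library do this), might look like:

```python
from datetime import datetime, timedelta, timezone

# Reference epoch taken from the TIME:units attribute in the CDL example.
EPOCH = datetime(1950, 1, 1, tzinfo=timezone.utc)


def decode_days_since_1950(days):
    """Convert 'days since 1950-01-01 00:00:00Z' offsets to UTC datetimes."""
    return [EPOCH + timedelta(days=d) for d in days]


times = decode_days_since_1950([0.5, 1.5, 2.5, 3.5, 4.5])
print(times[0].isoformat())   # 1950-01-01T12:00:00+00:00
print(times[-1].isoformat())  # 1950-01-05T12:00:00+00:00
```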
OPeNDAP & THREDDS Open-source Project for a Network Data Access Protocol OPeNDAP provides access to local data from remote locations regardless of local storage format Access data via URLs - URL = dataset - Clients get just the data they need, as they need them THREDDS is middleware to bridge the gap between data providers and data users THREDDS Data Server (TDS) provides catalogue, metadata, and data access services - Includes OPeNDAP remote data access protocols
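The idea that a URL names a dataset, and that clients fetch only the slice they need, can be sketched by building a DAP2-style request with an index constraint of the form `variable[start:stride:stop]`. The server address and file path below are hypothetical, and the helper is an illustration of the URL pattern, not a client library.

```python
def opendap_subset_url(dataset_url, variable, ranges, response="ascii"):
    """Build a DAP2 request URL for an index hyperslab of one variable.

    ranges holds one inclusive (start, stop) index pair per dimension.
    """
    for start, stop in ranges:
        if start < 0 or stop < start:
            raise ValueError(f"invalid index range: ({start}, {stop})")
    # A stride of 1 selects every index between start and stop.
    hyperslab = "".join(f"[{start}:1:{stop}]" for start, stop in ranges)
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


# Hypothetical THREDDS endpoint; only the URL pattern matters here.
DATASET = "https://thredds.example.org/thredds/dodsC/sst/example.nc"

# First time step only, all 3 latitudes, both longitudes.
url = opendap_subset_url(DATASET, "TEMP", [(0, 0), (0, 2), (0, 1)])
print(url)
```

The request above asks for 6 values instead of the whole array, which is the sense in which clients get just the data they need.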
THREDDS Server - Catalogue
Data management tools Ocean Data View (ODV) - analysis and visualization of oceanographic and other geo-referenced profile, time-series, trajectory or sequence data SAGA GIS - Free and Open Source Software for visualization and analysis Integrated Data Viewer (IDV) - Visualization of operational data (including meteorological and oceanographic) - netCDF capability - 2D/3D display
Integrated Data Viewer (IDV) Integrated Data Viewer (IDV) is a Java-based software framework for analyzing and visualizing environmental data that has been developed by Unidata Allows easy access to operational data - Model data - Observation data - Satellite and radar imagery More later. Figure: sea level pressure and upper level jet