Climate and forecast (CF) metadata convention



Similar documents
PART 1. Representations of atmospheric phenomena

Scientific Data Management and Dissemination

Integrated Climate Data Center How to use our data center? Integrated Climate Data Center - Remon Sadikni - remon.sadikni@zmaw.de

Structure? Integrated Climate Data Center How to use the ICDC? Tools? Data Formats? User

ANNEX C to ADDITIONAL MILITARY LAYERS GRIDDED SEDIMENT ENVIRONMENT SEABED BEACH PRODUCT SPECIFICATION Version 1.0, 26th July 2005

A Project to Create Bias-Corrected Marine Climate Observations from ICOADS

Norwegian Satellite Earth Observation Database for Marine and Polar Research USE CASES

NOMADS. Jordan Alpert, Jun Wang NCEP/NWS. Jordan C. Alpert where the nation s climate and weather services begin

Long-term Observations of the Convective Boundary Layer (CBL) and Shallow cumulus Clouds using Cloud Radar at the SGP ARM Climate Research Facility

AmeriFlux Site and Data Exploration System

NCDC s SATELLITE DATA, PRODUCTS, and SERVICES

CCAM GUI INSTRUCTION MANUAL FOR TAPM USERS (v1004t) Marcus Thatcher CSIRO Marine and Atmospheric Research

Application of Numerical Weather Prediction Models for Drought Monitoring. Gregor Gregorič Jožef Roškar Environmental Agency of Slovenia

Tools for Viewing and Quality Checking ARM Data

Guy Carpenter Asia-Pacific Climate Impact Centre, School of energy and Environment, City University of Hong Kong

Basic Climatological Station Metadata Current status. Metadata compiled: 30 JAN Synoptic Network, Reference Climate Stations

Primary author: Kaspar, Frank (DWD - Deutscher Wetterdienst), Frank.Kaspar@dwd.de

Temporal variation in snow cover over sea ice in Antarctica using AMSR-E data product

Why aren t climate models getting better? Bjorn Stevens, UCLA

Limitations of Equilibrium Or: What if τ LS τ adj?

Nevada NSF EPSCoR Track 1 Data Management Plan

Interoperable Documentation

The Arctic Observing Network and its Data Management Challenges Florence Fetterer (NSIDC/CIRES/CU), James A. Moore (NCAR/EOL), and the CADIS team

GOAPP Data Management Policy

Latin American and Caribbean Flood and Drought Monitor Tutorial Last Updated: November 2014

WORLD METEOROLOGICAL ORGANIZATION. Introduction to. GRIB Edition1 and GRIB Edition 2

Selected Ac)vi)es Overseen by the WDAC

Predicting daily incoming solar energy from weather data

Data-Intensive Science and Scientific Data Infrastructure

An Introduction to the MTG-IRS Mission

Ocean Level-3 Standard Mapped Image Products June 4, 2010

NRM Climate Projections

13.2 THE INTEGRATED DATA VIEWER A WEB-ENABLED APPLICATION FOR SCIENTIFIC ANALYSIS AND VISUALIZATION

Analysis of Climatic and Environmental Changes Using CLEARS Web-GIS Information-Computational System: Siberia Case Study

Inference and Analysis of Climate Models via Bayesian Approaches

Addendum to the CMIP5 Experiment Design Document: A compendium of relevant s sent to the modeling groups

Improving Representation of Turbulence and Clouds In Coarse-Grid CRMs

Indian Ocean and Monsoon

Obtaining and Processing MODIS Data

Panoply I N T E R D Y N A M I K. A Tool for Visualizing NetCDF-Formatted Model Output

Improvement in the Assessment of SIRS Broadband Longwave Radiation Data Quality

"CLIMATE DATA OPERATORS" AS A USER-FRIENDLY PROCESSING TOOL FOR CMSAF'S SATELLITE-DERIVED CLIMATE MONITORING PRODUCTS

DATA MANAGEMENT (DM) - (e.g. codes, monitoring) software offered to WMO Members for exchange

StormSurgeViz: A Visualization and Analysis Application for Distributed ADCIRC-based Coastal Storm Surge, Inundation, and Wave Modeling

EN DK NA:2007

Evaluating GCM clouds using instrument simulators

Environmental Data Management:

Core Components Data Type Catalogue Version October 2011

EDEXCEL NATIONAL CERTIFICATE/DIPLOMA MECHANICAL PRINCIPLES AND APPLICATIONS NQF LEVEL 3 OUTCOME 1 - LOADING SYSTEMS

Introduction to IODE Data Management. Greg Reed Past Co-Chair IODE

Can latent heat release have a negative effect on polar low intensity?

WORLD WEATHER ONLINE

Evalua&ng Downdra/ Parameteriza&ons with High Resolu&on CRM Data

Interactive Data Visualization with Focus on Climate Research

Using Cloud-Resolving Model Simulations of Deep Convection to Inform Cloud Parameterizations in Large-Scale Models

Open Data Policy 25 th November 2014 Clare Hubbard Data Policy Manager

Advancement of the NOMADS for Observational Data and Model Intercomparisons and the Establishment of a NCDC NOMADS Team and HelpDesk

Performance Metrics for Climate Models: WDAC advancements towards routine modeling benchmarks

Proposal for a Discovery-level WMO Metadata Standard

Research Objective 4: Develop improved parameterizations of boundary-layer clouds and turbulence for use in MMFs and GCRMs

When the fluid velocity is zero, called the hydrostatic condition, the pressure variation is due only to the weight of the fluid.

HYCOM Data Management -- opportunities & planning --

Description of Scatterometer Data Products

The NetCDF Users Guide

Description of zero-buoyancy entraining plume model

Satellite Products and Dissemination: Visualization and Data Access

CALIPSO, CloudSat, CERES, and MODIS Merged Data Product

Visualization of Climate Data in R. By: Mehrdad Nezamdoost April 2013

What the Heck are Low-Cloud Feedbacks? Takanobu Yamaguchi Rachel R. McCrary Anna B. Harper

Using Python in Climate and Meteorology

Outcomes of the CDS Technical Infrastructure Workshop

Data Management Handbook

Chapter 3: Weather Map. Weather Maps. The Station Model. Weather Map on 7/7/2005 4/29/2011

Comment on "Observational and model evidence for positive low-level cloud feedback"

DATA VALIDATION, PROCESSING, AND REPORTING

Solar Flux and Flux Density. Lecture 3: Global Energy Cycle. Solar Energy Incident On the Earth. Solar Flux Density Reaching Earth

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

Diurnal Cycle of Convection at the ARM SGP Site: Role of Large-Scale Forcing, Surface Fluxes, and Convective Inhibition

The PRISM software framework and the OASIS coupler

Development and Use of Marine XML within the Australian Oceanographic Data Centre to Encapsulate Marine Data. Abstract

Data Quality Assurance and Control Methods for Weather Observing Networks. Cindy Luttrell University of Oklahoma Oklahoma Mesonet

Chapter 2: Solar Radiation and Seasons

Financing Community Wind

MI oceanographic data

SEA START Climate Change Analysis Tool v1.1

Unified Lecture # 4 Vectors

ECMWF Aerosol and Cloud Detection Software. User Guide. version /01/2015. Reima Eresmaa ECMWF

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler

Iris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode

Studying Topography, Orographic Rainfall, and Ecosystems (STORE)

Traffic Management Systems with Air Quality Monitoring Feedback. Phil Govier City & County of Swansea

Levels of Archival Stewardship at the NOAA National Oceanographic Data Center: A Conceptual Model 1

Climate and Weather. This document explains where we obtain weather and climate data and how we incorporate it into metrics:

Data Storage STAT 133. Gaston Sanchez. Department of Statistics, UC Berkeley

Arithmetic Coding: Introduction

COM CO P 5318 Da t Da a t Explora Explor t a ion and Analysis y Chapte Chapt r e 3

GIS Initiative: Developing an atmospheric data model for GIS. Olga Wilhelmi (ESIG), Jennifer Boehnert (RAP/ESIG) and Terri Betancourt (RAP)

Real-time Ocean Forecasting Needs at NCEP National Weather Service

An XML Based Marine Data Management Framework

Archiving of Simulations within the NERC Data Management Framework: BADC Policy and Guidelines.

Transcription:

Climate and forecast (CF) metadata convention Jonathan Gregory CGAM, Department of Meteorology, University of Reading, UK Met Office Hadley Centre, UK CF has been developed by Brian Eaton, Jonathan Gregory, Bob Drach, Karl Taylor and Steve Hankin.

Goals Locate data in space time and as a function of other independent variables, to facilitate processing and graphics Identify data sufficiently to enable users of data from different sources to decide what is comparable, and to distinguish variables in archives Framed as a netcdf standard, but most CF ideas relate to metadata design in general and not specifically to netcdf, and hence can be contained in other formats such as XML

Basis of design Intended for use with climate and forecast data atmosphere, surface and ocean model-generated data and comparable observational datasets Backward-compatible with COARDS so that applications which understand CF can also process COARDS datasets CF datasets will not break applications based on COARDS Hence COARDS is a subset of CF where COARDS is adequate, CF does not provide an alternative extensions to COARDS are all optional and provide new functionality But some COARDS features are deprecated in CF

General principles Data should be self-describing no external tables needed to interpret it Conventions have been developed only for things we know we need Avoid being too onerous for data-writers and data-readers Metadata readable by humans as well as easily parsed by programs Minimise redundancy and possibilities for silly mistakes Avoid multiplicity of attributes NetCDF-specific principles Information is generally provided per-variable, not per-file Nothing depends on the names of variables (except the Unidata coordinate variable convention).

Origin of the data CF provides some basic discovery metadata in global attributes title What s in the file *institution Where it was produced *source How it was produced e.g. model version, instrument type history Audit trail of processing operations *references Pointers to publications or web documentation *comment Miscellaneous *These ones can also be attributes of variables containing data

Description of the data units is mandatory for all variables containing data other than dimensionless numbers. The units do not identify the physical quantity. units can be udunits strings e.g. 1, degc, Pa, mbar, W m-2, kg/m2/s, mm day^-1 or COARDS specials layer, level, sigma level. Udunits doesn t support ppm, psu, db, Sv. standard name identifies the quantity. Units must be consistent with standard name and any statistical processing e.g. variance. Standard name does not include coordinate or processing information. long name is not standardised. ancillary variables is a pointer to variables providing metadata about the individual data values e.g. standard error or data quality information. Numeric data variables may have FillValue, missing value, valid max, valid min, valid range. CF deprecates missing value. Variables containing flag values need flag values and flag meanings to make them self-describing.

Standard name table Currently 366 entries. Expanded on request. Many expected from PRISM. Machineable XML or human-readable HTML with help text and guidelines. Units GRIB PCMDI Standard name K 13 theta air potential temperature 1 71 E164 clt cloud area fraction kg m-2 79 large scale snowfall amount kg m-2 s-1 large scale snowfall flux m s-1 lwe large scale snowfall rate K Pa s-1 mpwapta product of omega and air temperature 1 region 1 91 sic sea ice area fraction 1e-3 88 so sea water salinity W m-2 rlds surface downwelling longwave flux W m-2 rls surface net downward longwave flux

Dimensions and coordinates Dimensions establish the index space of data variables e.g. temperature(lat,lon) has an element temperature(46,0). Coordinates are the independent variables, data the dependent variables e.g. temperature is 252.2 K at 0 E and 25.0 S. COARDS requires CDL dimension order tzyx. CF recommends it. point time 11:00 12:00 13:00 14:00 15:00 16:00 latitude longitude height=1.5 m point is a dimension without coordinates height is a coordinate without dimension or of size one

Variables containing coordinate data The Unidata netcdf standard associates one coordinate variable with each dimension, by identity of name e.g. lat(lat). The coordinate variable distinguishes the elements along the axis its values must be distinct. By convention they are strictly monotonic. The CF standard introduces Scalar coordinate variables: zero-dimensional variables or single strings. Auxiliary coordinate variables: have any subset of the dimensions of the data variable, not necessarily monotonic. Associated with the data variable through the coordinates attribute.

sigma(sigma) :positive="down" 0.50 0.85 0.95 1.00 :axis="z" :formula_terms="sigma: sigma ps: ps" :coordinates="model_level_number" model_level_number(sigma) 4 3 2 1 longitude(longitude) :units="degrees_east" :axis="x" latitude(latitude) :units="degrees_north" :axis="y"

lon(point) lat(point) 11:00 12:00 13:00 14:00 15:00 16:00 time name(point,length) Hamburg Livermore 10 122 75 1 54 38 40 51 Princeton Reading float air_temperature(time,point) :coordinates="lat lon name" y(y) x(x) float air_pressure_at_sea_level(y,x) :coordinates="lat lon" :grid_mapping="mappinginfo" float lat(y,x)

Time Time (year, month, day, hour, minute, second) is encoded with units time unit since reference time. The encoding depends on the calendar, which defines the permitted values of (year, month, day). e.g. 2003-8-31 exists in the standard calendar, but not in the 360-day calendar. 36583.625 in standard 2000-2-29 15:00:00 = days since 1900-1-1 36058.625 in 360-day COARDS supports only the standard calendar. Beware year and month units! year=365.242 days, month=year/12.

If not default, the calendar is indicated by the calendar attribute of the time coordinate variable. Possibilities are gregorian or standard (the default) proleptic gregorian noleap or 365 day all leap or 366 day 360 day julian Also none, for perpetual season. Arbitrary calendars can be defined e.g. for palæoclimate in terms of month lengths. Unfortunately udunits supports only the standard calendar at present. CDMS can handle others. Perhaps ncgen and ncdump will be extended to support formatted time : data: time=2000-2-29 15:00:00, 2000-2-29 16:00:00,... instead of time=36583.625, 36583.667,...

Bounds It is often necessary to know the extent of a cell as well as the grid point location e.g. area of lon lat boxes, thickness of a vertical layer, length of a time-mean period. Bounds can be particularly important for size-one coordinate variables. Boundary variables can be attached to any variable containing coordinate data. They have an extra trailing dimension and assume all attributes of their parent. They can be used to test contiguousness of cells. b(i,0) b(i+1,0) p(j+1,i) p(j+1,i+1) p(t) p(i) p(i+1) b(i,1) b(i+1,1) b(j+1,i,0) b(j,i,3) b(j,i,2) b(j,i+1,3) p(j,i) p(j,i+1) b(s,v) p(s) b(r,4) b(r,3) b(s,v+1) b(r,5) p(r) b(r,2) b(j,i,0) b(j,i,1) b(j,i+1,0) b(r,0) b(r,1)

Cell methods The cell methods attribute of a data variable indicates how variation within the cells is represented. The method may be different for each axis. Default: point for intensive quantities e.g. temperature sum for extensive quantities e.g. precipitation amount in time 10 5 0 5 10 7 Sept 8 9 10 11 12 13 14 15 16 17 :cell_methods="time: maximum" 60S 30S 0 30N 60N 10 0 10 20 30 :cell_methods="longitude: mean"

Climatological statistics Climatological statistics may be derived from Corresponding portions of the annual cycle in a set of years, e.g. average January temperatures in the climatology of 1961 1991. Corresponding portions of a range of days e.g. the average diurnal cycle in April 1997. Both concepts at once. COARDS supports the first kind with the year 0 convention. CF deprecates this because (a) it doesn t indicate the range of years, (b) there is no convention for how dates in year 0 should be encoded, (c) year 0 may be a real year in non-standard calendars. Climatological statistics have a time coordinate with climatological bounds. An interval of climatological time represents a set of subintervals which are not necessarily contiguous.

The cell methods indicates the interpretation. NB bounds are shown translated into (year, month, day, etc.) Average winter-minimum temperature for 1961 1991 cell methods="time: minimum within years time: mean over years" climatology bounds=1961-12-1, 1991-3-1 Average temperature for 1 2 a.m. in April 1997 cell methods="time: mean within days time: mean over days" climatology bounds=1997-4-1 1:00, 1997-4-30 2:00 Monthly-maximum daily precipitation total for June 2000 cell methods="time: sum within days time: maximum over days" climatology bounds=2000-6-1 6:00:00, 2000-7-1 6:00:00

Reduction of dataset size Packing (lossy): packed value = (unpacked value-add offset)/scale factor byte short int float double Gathering: 9 11 17 19 20 21 lat(lat) 17 19 20 21 9 11 int land(land) float data(land) :compress="lat lon" lon(lon) float data(lat,lon) Built-in per-variable compression in netcdf would be useful!

Evolution of the standard Non-beta CF-1.0 about to be released. CF and the standard name table will continue to evolve. Work will immediately begin on 1.1, to include: statistical operations which combine data variables e.g. covariance. handling of forecast and analysis time. more grid mappings. support for spectral representation. Currently no recognition of: related horizontal grids e.g. Arakawa B-grid, C-grid. relations between vector and tensor components. Do we need these? CDML and NcML introduce some features and concepts not present in CF.

CF in the real world CF has a web site http://www.cgd.ucar.edu/cms/eaton/cf-metadata and an email list for requesting and discussing extensions. Adopted by: PCMDI and *MIP, PRISM, ESMF, NCAR, Hadley Centre, GFDL, various EU projects. Supporting software: CF-checker. Planned PCMDI output routine. VCDAT/CDMS. NCO? Ferret? Simple utilities? CF has been developed by volunteers. It may become too much for us.