Canadian Astronomy Data Centre. Séverin Gaudet David Schade Canadian Astronomy Data Centre
1 Canadian Astronomy Data Centre
Séverin Gaudet, David Schade
Canadian Astronomy Data Centre
2 Data Activities in Astronomy
Features of the astronomy data landscape:
- Multi-wavelength datasets are increasingly important scientifically
- More large, homogeneous survey datasets are being produced
- Open data policies
  - With pre-defined proprietary periods
- Good IT infrastructure
- Standards for data interoperability
  - Common file formats
  - Virtual Observatory project
- Immersed in a world of remarkable technical capabilities
3 The Data Benchmark
- Since 1990, the Hubble Space Telescope Science Archive has set the standard
- Mandated by NASA, supported by ESA and CSA
- Public data policy
- Accessibility
- Processed products
- In 2005, the papers based on archival data exceeded PI papers
4 The Data Benchmark
5 International Virtual Observatory Alliance
6-7 Activities of a Data Centre
Data curation (paraphrased from Wikipedia):
- The process of identification and organisation of objects in a collection in order to further knowledge.
- Includes verification and additions to the existing metadata for objects.
- The process of examining, testing and selecting metadata to go in a collection database.
Activities:
- Data transfer and ingestion
- Data modelling and characterisation
- Data processing
- Data discovery
- Data distribution
- Data preservation
8 Canadian Astronomy Data Centre
- Created in 1986 following a CASCA resolution in 1985
- Original mandate: to serve Hubble Space Telescope data
- Original name: the Canadian Space Astronomy Data Centre
- Supported in part by the Canadian Space Agency since the mid-90s
- Now a national facility serving many of Canada's major telescopes
9 Canadian Astronomy Data Centre
HST, FUSE, MOST, CFHT, Gemini N, Gemini S, JCMT, CGPS, MACHO
Heterogeneous collection:
- Multiple missions and facilities
- Multiple wavelengths
- Pointed and survey observations
- Many different archive data models
10 Growth in CADC Collections
[Figure: collection size in terabytes by year]
- In 2007: > 80 TB
- 8 major sources of data
- Network transfer only
11 Current Collection

Archive    Number of Files    Number of GB
BLAST                   19               1
CFHT              1,946,…           …,950
DSS               7,268,…               …
FUSE            3,433,858           2,709
GEMINI          1,689,389           5,676
GPS                   2,…               …
HST            12,785,565          59,314
IRIS                1,720               2
JCMT              740,652           1,364
MACHO           2,066,319          15,230
MOST                    …               …
Total            29,964,…           …,486

- Two compressed copies on disk at HIA
- One compressed copy on tape at UVic
12-14 Activities: Data Processing
Data processing:
- Removing instrument and telescope signatures
- Combining multiple images
- Useable science products
- Processing close to the data
- Examples: HST WFPC2, HST ACS, CFHT Megacam
15-18 Activities: Data Discovery
19-22 Inter-archive Links
23-25 Activities: Data Distribution
Data distribution:
- Select and download
- Direct programmatic access
- Asynchronous retrieval
26 Activities: Data Distribution
Data distribution:
- Select and download
- Direct programmatic access
- Asynchronous retrieval
Retrieve an HST ACS drizzle image:
    curl -g -o J8OZ02010_DRZ.fits.gz anonproxy/getdata?archive=hst&file_id=j8oz02010_drz
Retrieve extension 10 of a proprietary CFHT12K image:
    curl -g -u username:password -o o_10.fits.gz archive=cfht&file_id=687344o&cutout=[10]
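The two retrieval commands on slide 26 appear to share one pattern: a query string carrying `archive`, `file_id`, and an optional `cutout` parameter. The sketch below shows how such a request might be assembled. The transcript does not preserve the service host, so `BASE` is a placeholder and not the real CADC address. The helper name `build_getdata_url` is hypothetical. The commands are only printed (a dry run), never executed.

```shell
#!/usr/bin/env bash
# Sketch only. BASE is a hypothetical placeholder: the host in the slide's
# URLs was lost in transcription, so substitute the real endpoint yourself.
BASE="https://archive.example.org/getdata"

# build_getdata_url ARCHIVE FILE_ID [CUTOUT]
# Hypothetical helper that assembles the query string shown on the slide.
build_getdata_url() {
  local archive="$1"
  local file_id="$2"
  local cutout="${3:-}"
  local url="${BASE}?archive=${archive}&file_id=${file_id}"
  # The cutout parameter (e.g. a FITS extension number) is optional.
  if [ -n "$cutout" ]; then
    url="${url}&cutout=[${cutout}]"
  fi
  printf '%s\n' "$url"
}

# Dry run: print the curl commands rather than running them.
# curl's -g (--globoff) keeps the literal [10] from being read as a URL range.
printf 'curl -g -o %s "%s"\n' J8OZ02010_DRZ.fits.gz \
  "$(build_getdata_url hst j8oz02010_drz)"
printf 'curl -g -u username:password -o %s "%s"\n' o_10.fits.gz \
  "$(build_getdata_url cfht 687344o 10)"
```

Quoting the URL matters when the command is pasted into a shell: an unquoted `&` would otherwise put the command in the background and drop the remaining parameters.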
27 CADC Data Distribution
[Figure: data distributed in terabytes (log scale) by year]
- In 2007: > 1 million files, > 58 TB
- Provided data and services to > 2500 distinct hosts worldwide
- Network distribution only
28 CADC Data Flows
- Users in 87 countries
29 International Virtual Observatory Alliance
- Mission: To facilitate the international coordination and collaboration to enable the international utilization of astronomical archives as an integrated and interoperating virtual observatory.
- Formed in 2002
- A culture of data sharing
30 International Virtual Observatory Alliance
Developing:
- Data models
- Data query and access protocols
- Service discovery
- Tools
Data centres bear the cost of implementation
- Benefit from world-wide development efforts
31 VO Tools
32 VO Tools Octet
36 Lessons Learned
- Archives are not just a technological exercise: they are science projects!
- Multidisciplinary team necessary
- Enable the user to find relevant data
- Well-described, reliable data
- Good interfaces with data providers
- Good interfaces with user communities
- End-to-end data management is part of the whole mission design: retro-fitting is not fun!
37 Managing Change
Change is driven by...
- Data providers (telescopes)
- User community
- Funding agencies
and by...
- Enabling technologies
- Virtual Observatory
38 The Changing Role: Data Providers
- More services
  - Data distribution
  - Processing
  - Data management
- Quality of service
  - 24x7 availability
  - Robust infrastructure
  - Fail-over systems
39 The Changing Role: User Community
- Improved access
  - Anonymous access to public data
  - Authenticated access to proprietary data
  - Direct programmatic access
- An extension of the user's storage
- User-defined processing
- Quality of service
  - 24x7 availability
  - Robust infrastructure
  - Fail-over systems
- Community projects
40 Challenges
- The evolving data centre role
- The Virtual Observatory
- Continuous improvement of services to the user community
- Promotion and training
- Maintaining and fostering international collaborations
- Technology
- New missions (e.g. JWST, UVIT, ...)
- Knowledge retention
- The last network mile
- Funding agencies
