Big Data, Big Challenges and new Paradigm for the Gaia Archive

1 Big Data, Big Challenges and new Paradigm for the Gaia Archive
Christophe Arviset, Head of ESAC Science Data Centre
15/03/2016
Issue/Revision: 1.0
Reference: Gaia Archive's big data challenges
Status: Issued

2 ESAC Science Data Centre
The Digital Library of the Universe
At ESA's European Space Astronomy Centre, near Madrid, Spain
Science archives from >15 space missions:
   Astronomy, Planetary, Solar System
   From all phases (dev, ops, post-ops, legacy)
   Calibrated processed data, high level data products, raw data
Different users:
   Scientific community (public access)
   PI team and observers (controlled access)
   Science Operations Team (privileged access)
Common software architecture and look and feel
C.Arviset Gaia Archive's big data challenges BIDS 15/03/2016 Slide 2

3 Gaia Satellite and Data Overview
1. ESA Cornerstone mission, launched 19/12/2013
2. Stereoscopic census of the Galaxy over 5 years
   a. 1-2 billion sources with unprecedented accuracy
   b. 100 TB downlink
   c. Up to 1 PB calibrated data: telescope transits, astrometric observations, 150 x 10^6 spectra
3. Big data processing challenge as well!
   a. (outside the scope of this presentation)
4. 1st public release of the Gaia catalogue in summer
5. ~1 new release per year
6. Final catalogue ~2022

4 Gaia is definitely a major astronomy Big Data project
Volume: 1 PB of data in total, not really big data
Velocity: massively complex data processing challenges, FLOP
Variety: source catalogue, spectra, telescope transits
Veracity: astrometry, photometry and spectroscopy with high quality
Value: believed to revolutionize astronomy
Most accurate, consistent, complete, and challenging astrometric data set to date

5 Standard Archives Architecture
[Diagram. Components shown: Command line (http), Browser GUI (http), VO Apps (SAMP; SIAP, SSAP), science archive, VO services, Database, Data Repository, ftp, User Disk]

6 ESAC Archives Volume evolution
All data stored on hard disks and distributed through the Internet
Euclid will add up to ~150 PB by 2023

7 Gaia Archive current content
Simulations
   GUMS
      Milky Way: 2x10^9 rows
      Large Magellanic Cloud: 7.5x10^6 rows
      Small Magellanic Cloud: 1.2x10^6 rows
      Galaxies: 38x10^6 rows
      Quasars: 10^6 rows
   GOG: 1.8x10^9 rows
External catalogues
   IGSL (Initial Gaia Source List): 1.2x10^9 rows
   2MASS: 9.4x10^8 rows
   Tycho2: 2.5x10^6 rows
   UCAC4: 1.1x10^8 rows
Gaia
   TGAS (private validation team area): 2x10^6 rows
   Foreseen, first Gaia catalogue: >10^9 rows

8 Need for new paradigm
1. New ways required to access the Gaia catalogue and associated data
   a. Powerful query mechanism, asynchronicity of results
   b. One query interface for all archive services and VO services
2. Users cannot download the whole catalogue and all the data
   a. Need to have user workspaces IN the Archive: user database space, user disk space
   b. User workspace shareable amongst various users
3. Bring user code to the data
   a. Part of the user workspace in the archive
   b. Share code with other users
The user works with the data WHERE the data is: Archive 2.0 concept
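
To put point 2 in numbers, here is a back-of-the-envelope estimate of how long a full download would take. The 1 PB volume is the figure quoted on slide 3; the 100 Mbit/s sustained link speed is purely an assumption for illustration, and the function name is hypothetical.

```python
def download_time_days(volume_bytes: float, link_bits_per_s: float) -> float:
    """Return the transfer time in days at a sustained link speed."""
    seconds = volume_bytes * 8 / link_bits_per_s
    return seconds / 86_400  # seconds per day


PETABYTE = 1e15       # decimal petabyte, in bytes
ASSUMED_LINK = 100e6  # assumption: 100 Mbit/s sustained, not from the slides

days = download_time_days(PETABYTE, ASSUMED_LINK)
print(f"{days:.0f} days (~{days / 365.25:.1f} years)")
```

Under that assumption a single user would need roughly two and a half years of uninterrupted transfer, which is why the slide argues for moving the work to the archive instead.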

9 Gaia Archive Architecture
[Diagram. Components shown: VO protocols, archive core systems, TAP+ADQL query language, UWS (job scheduler), Database, Data Repository, VOSpace (user work space), ftp, User Disk]

10 The interrogator / TAP + ADQL
[Figure: Browser GUI]
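
As an illustration of the kind of ADQL query such an interface accepts, the sketch below builds a simple cone search string. The table name `public.igsl_source` and the column names `ra` and `dec` are hypothetical placeholders, not names taken from the slides; `TOP`, `CONTAINS`, `POINT` and `CIRCLE` are standard ADQL constructs.

```python
def cone_search_adql(table: str, ra_deg: float, dec_deg: float,
                     radius_deg: float, top: int = 10) -> str:
    """Build an ADQL cone search: rows of `table` within a circle on the sky."""
    return (
        f"SELECT TOP {top} * FROM {table} "
        f"WHERE 1 = CONTAINS(POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg}))"
    )


# Hypothetical table name; coordinates and radius are in degrees.
query = cone_search_adql("public.igsl_source", 56.75, 24.12, 0.5)
print(query)
```

The resulting string is what would be pasted into the query form or sent through the query protocol; the function only formats text and does not contact any service.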

11 Upload / xmatch / sharing
[Figure: Browser GUI]
Upload: a table can be uploaded into the user's private area
Crossmatch: an uploaded table can be crossmatched with any other table
Sharing: any private table can be shared with other users

12 Catalogue crossmatch
[Figure: Browser GUI]
Crossmatch functionality is provided for any table available: public catalogues, user's tables and shared tables
Crossmatch results:
   Join table in the user's space at the server
   Default crossmatch join query provided
   Can be shared with other users

13 Gaia Archive Crossmatch Examples (20 threads)
[Figure: Browser GUI]
Catalogue 1             Catalogue 2                 Radius (arcsec)  # results   Time
Tycho2 (2.5x10^6 rows)  2MASS PSC (4.7x10^8 rows)   1                2,495,304   49s
Tycho2 (2.5x10^6 rows)  2MASS PSC (4.7x10^8 rows)   5                2,614,      s
Tycho2 (2.5x10^6 rows)  IGSL (1.2x10^9 rows)        1                2,600,542   46s
Tycho2 (2.5x10^6 rows)  IGSL (1.2x10^9 rows)        5                2,829,401   55s
Tycho2 vs IGSL crossmatches are even faster than the ones with 2MASS because IGSL is located in the fastest local storage (PCIe), even though IGSL (similar in size to the final Gaia catalogue) is around 3 times bigger than 2MASS
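
For readers unfamiliar with positional crossmatching, the following is a minimal sketch of the idea: pair every source in one catalogue with every source in another that lies within a given angular radius. It is a brute-force illustration with made-up coordinates, not the archive's implementation, which the slides do not describe; all function names are hypothetical.

```python
import math

ARCSEC_PER_DEG = 3600.0


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation via the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * ARCSEC_PER_DEG


def crossmatch(cat1, cat2, radius_arcsec):
    """Return (id1, id2, separation) for every pair within the radius."""
    matches = []
    for id1, ra1, dec1 in cat1:
        for id2, ra2, dec2 in cat2:
            sep = separation_arcsec(ra1, dec1, ra2, dec2)
            if sep <= radius_arcsec:
                matches.append((id1, id2, sep))
    return matches


# Made-up (id, ra, dec) rows in degrees, for illustration only.
cat_a = [("a1", 10.0, 20.0), ("a2", 50.0, -30.0)]
cat_b = [("b1", 10.0, 20.0002), ("b2", 50.001, -30.0)]

matches_1 = crossmatch(cat_a, cat_b, radius_arcsec=1.0)
matches_5 = crossmatch(cat_a, cat_b, radius_arcsec=5.0)
print(len(matches_1), len(matches_5))
```

As in the table above, a larger radius yields more pairs. The nested loop costs one comparison per pair of rows, so it is only usable on tiny inputs; crossmatching catalogues of 10^6 to 10^9 rows in under a minute implies some form of spatial indexing and parallelism on the server side.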

14 User work space: VOSpace
[Diagram: VOSpace, User Work Space]

15 VOSpace: Virtual storage for collaboration
Dropbox for the VO
Accessible from VO data access protocols
Accessible from VO applications (TopCat, Aladin, ...)
Share with other users:
   Your files
   Your software code
   Your virtual machines

16 Gaia Archive Architecture
[Diagram. Components shown: VO protocols, TAP+ADQL query language, archive core systems, UWS (job scheduler), Database, Data Repository, VOSpace (user work space), ftp, User Disk]

17 Command line
Gaia command line interface
Everything that can be done from the Web GUI can be done by script
1. Query (synchronous, asynchronous), login, table upload, crossmatch, download, etc.
Various languages now available:
1. Python, Java, C++
2. More to come
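
To show what scripted synchronous and asynchronous queries look like at the protocol level, here is a sketch that assembles the HTTP requests defined by the IVOA TAP and UWS standards. It is not the Gaia command line client itself: the base URL is a hypothetical placeholder, the helper names are invented for this example, and nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

TAP_BASE = "https://archive.example.org/tap"  # hypothetical service URL


def tap_params(adql: str, fmt: str = "votable") -> dict:
    """Standard TAP query parameters for an ADQL query."""
    return {"REQUEST": "doQuery", "LANG": "ADQL", "FORMAT": fmt, "QUERY": adql}


def sync_url(adql: str) -> str:
    """Synchronous query: one request, the response body is the result."""
    return f"{TAP_BASE}/sync?{urlencode(tap_params(adql))}"


def async_steps(adql: str, job_id: str = "<job-id>") -> list:
    """Describe the UWS call sequence for an asynchronous query."""
    job = f"{TAP_BASE}/async/{job_id}"
    return [
        ("POST", f"{TAP_BASE}/async", tap_params(adql)),  # create the job
        ("POST", f"{job}/phase", {"PHASE": "RUN"}),       # start it
        ("GET", f"{job}/phase", {}),                      # poll until COMPLETED
        ("GET", f"{job}/results/result", {}),             # fetch the result
    ]


adql = "SELECT TOP 1 * FROM t"  # placeholder query against a hypothetical table
print(sync_url(adql))
for method, url, params in async_steps(adql):
    print(method, url, params)
```

The asynchronous sequence is what makes long-running queries practical: the client can disconnect after starting the job and come back later to poll the phase and download the result.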

18 Gaia Archive Architecture
[Diagram. Components shown: VO protocols, TAP+ADQL query language, archive core systems, UWS (job scheduler), SAMP, SSAP, Database, Data Repository, VOSpace (user work space), ftp, User Disk]

19 Interoperability with other VO tools: SAMP (Simple Application Messaging Protocol)
[Figure: VO application TOPCAT, connected via SAMP]
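
A SAMP message is essentially a small map with a message type and its parameters. The sketch below builds the standard `table.load.votable` message that asks a receiving application such as TOPCAT to load a table; the URL, table identifier and function name are made up for illustration. Actually delivering the message requires a running SAMP hub and a client library (for example `astropy.samp`), which is deliberately left out here.

```python
def table_load_message(votable_url: str, table_id: str, name: str) -> dict:
    """Build a SAMP 'table.load.votable' message as a plain dictionary."""
    return {
        "samp.mtype": "table.load.votable",
        "samp.params": {
            "url": votable_url,    # where the receiver can fetch the VOTable
            "table-id": table_id,  # identifier for later messages about this table
            "name": name,          # human-readable label shown to the user
        },
    }


# Hypothetical values, for illustration only.
message = table_load_message(
    "https://archive.example.org/results/job-42.vot",
    "job-42",
    "Crossmatch result",
)
print(message)
```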

20 Gaia Archive Architecture
[Diagram. Components shown: VO protocols, TAP+ADQL query language, archive core systems, UWS (job scheduler), SAMP, SSAP, Database, ESASky, HiPS, VOSpace (user work space), Data Repository, ftp, User Disk]

21 Gaia Archive Architecture
[Diagram. Components shown: VO protocols, TAP+ADQL query language, archive core systems, UWS (job scheduler), SAMP, SSAP, Database, ESASky, HiPS, VOSpace (user work space), Data Repository, ftp, User Disk]

22 Gaia Added Value Interfaces
1. Extension of the Gaia Archive towards an archive of added value software
2. Collaborative Archive ~ Archive 2.0
   a. Users can bring their code to the archive and run it there (through containers)
   b. Users can share their data and their code with other archive users
3. Re-use of some of the VO technologies (TAP to access data, VOSpace to save code and data)
4. Could be used to host apps developed by anyone
   a. Specialized visualization (light curve folding, 3D, ...)
   b. Variable analysis
   c. Transient analysis
   d. Simulator execution

23 Gaia Added Value Interfaces Portal
1. ESA funds for GeoReturn activities, restricted to certain countries
   a. Industrial contracts coordinated from ESAC (V.Navarro)
2. GAVIP Portal to allow users to upload their code to run near the archive
   a. Ireland (Parameter space)
3. Four demonstrators of Added Value Interfaces
   1. GAVITA, Transient Alerts interface
      a. Ireland (Parameter space)
   2. GAVIDAV, Advanced Visualisation
      a. Portugal (Fork.Research, Uninova, FFCUL)
   3. GAVISC, Spectral Classification
      a. Finland (Space Systems Finland)
   4. GAVITEA, Temporal Analysis
      a. Finland (University of Helsinki)

24 Big data: Histogram provision
1. Complementary to the query access, need to visualize big data
   a. Production of density maps, 1D histograms
2. Interactive visualization through VO applications (Aladin lite), integrated into the Archive
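
A density map reduces a catalogue of any size to a fixed number of counts, which is what makes it suitable for interactive display. The sketch below bins made-up (ra, dec) positions on a regular grid in degrees; the grid size and function name are assumptions for illustration. Equal-angle cells are not equal-area on the sky, so a production service would more likely use an equal-area scheme such as HEALPix.

```python
def density_map(points, n_ra=4, n_dec=2):
    """Count sources per (dec, ra) cell on a regular grid in degrees."""
    grid = [[0] * n_ra for _ in range(n_dec)]
    for ra, dec in points:
        # Clamp so that dec = +90 falls in the last row instead of overflowing.
        i = min(int((dec + 90.0) / 180.0 * n_dec), n_dec - 1)
        j = min(int((ra % 360.0) / 360.0 * n_ra), n_ra - 1)
        grid[i][j] += 1
    return grid


# Made-up (ra, dec) positions in degrees.
sources = [(10.0, 5.0), (15.0, 40.0), (200.0, -60.0), (359.9, 89.9)]
grid = density_map(sources)
for row in grid:
    print(row)
```

The output has n_dec rows and n_ra columns regardless of how many sources go in, so only the small grid needs to travel to the browser.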

25 Big data: Histograms production
1. Need new big data techniques
   a. Map / Reduce processing paradigm
2. Need big data machines
   a. Big RAM, fast disks
   b. PostgreSQL DB
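
The Map / Reduce idea applied to histograms can be sketched in a few lines: each chunk of data is histogrammed independently (map), and the partial histograms are then summed (reduce). The example runs in a single process on made-up magnitudes; the slides do not describe how the archive actually implements this, and the function names are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_chunk(values, bin_width):
    """Map step: histogram one chunk independently of the others."""
    return Counter(int(v // bin_width) for v in values)


def reduce_counts(a, b):
    """Reduce step: merge two partial histograms by summing counts."""
    return a + b


# Made-up magnitudes, split into chunks as if stored on different workers.
magnitudes = [5.1, 5.9, 6.2, 7.7, 7.1, 6.8, 5.5, 9.3]
chunks = [magnitudes[i:i + 3] for i in range(0, len(magnitudes), 3)]

partials = [map_chunk(chunk, bin_width=1.0) for chunk in chunks]
histogram = reduce(reduce_counts, partials, Counter())
print(sorted(histogram.items()))
```

Because the map step touches each chunk only once and the reduce step is associative, the chunks can be processed in parallel and merged in any order.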

26 Conclusions
1. Gaia brings us big data together with new big data challenges
   a. Some can be addressed with well proven technologies (e.g. PostgreSQL)
   b. New technologies required (e.g. asynchronicity, visualization, Map/Reduce)
   c. Some key VO standards are fully part of the archive
2. Paradigm shift for archives and data access services
   a. User work space inside the archive
   b. Analysis work is done where the data is
3. Archive 2.0: open, dynamic and collaborative archive
   a. Users share their data and their code through VO protocols
   b. Users participate in building the new ecosystem around the archive

27 Thanks
1. My co-authors: Javier Durán, Juan González, Raúl Gutiérrez, José Hernández, Uwe Lammers, Bruno Merin, Alcione Mora, Sara Nieto, William O'Mullane, Jesús Salgado, Juan Carlos Segovia
2. ESAC Science Data Centre and Gaia Archive team in particular
3. Gaia Science Operations Centre at ESAC
4. DPAC - Data Processing and Analysis Consortium
5. GAVIP + AVIs teams

28 IVOA Specifications
ADQL: Astronomical Data Query Language
   Language used to query data
TAP: Table Access Protocol
   A protocol to access tables that contain the data
UWS: Universal Worker Service Pattern
   A job scheduler/handler to manage data queries
VOSpace: Interface to distributed storage
   A virtual storage system (a "VO dropbox++")
   (Some extensions required to fulfil science needs)
SAMP: Simple Application Messaging Protocol
   A protocol for applications to interconnect

The Gaia Archive. Center Forum, Heidelberg, June 10-11, 2013. Stefan Jordan. The Gaia Archive, COSADIE Astronomical Data

The Gaia Archive. Center Forum, Heidelberg, June 10-11, 2013. Stefan Jordan. The Gaia Archive, COSADIE Astronomical Data The Gaia Archive Astronomisches Rechen-Institut am Zentrum für Astronomie der Universität Heidelberg http://www.stefan-jordan.de 1 2 Gaia 2013-2018 and beyond Progress with Gaia 3 HIPPARCOS Gaia accuracy

More information

Archival Science with the ESAC Science Archives and Virtual Observatory

Archival Science with the ESAC Science Archives and Virtual Observatory Archival Science with the ESAC Science Archives and Virtual Observatory Deborah Baines Science Archives and VO Team Scientist European Space Agency (ESA) European Space Astronomy Centre (ESAC) Science

More information

ESA Sky. Bruno Merín Sara Nieto Elena Racero Jesús Salgado María Henar Sarmiento Pilar de Teodoro

ESA Sky. Bruno Merín Sara Nieto Elena Racero Jesús Salgado María Henar Sarmiento Pilar de Teodoro ESA Sky Deborah Baines Javier Castellanos Fabrizio Giordano Juan González Raúl Gutiérrez Belén López Martí Bruno Merín Sara Nieto Elena Racero Jesús Salgado María Henar Sarmiento Pilar de Teodoro Thanks

More information

Exploring Gaia data with TOPCAT and the Virtual Observatory

Exploring Gaia data with TOPCAT and the Virtual Observatory Exploring Gaia data with TOPCAT and the Virtual Observatory Mark Taylor (University of Bristol) Gaia and the Unseen Brown Dwarf Question GREAT-ESF Workshop Torino University 26 March 2014 $Id: tcvo.tex,v

More information

CU9 Science Enabling Applications Development Work Package Software Requirements Specification (WP970)

CU9 Science Enabling Applications Development Work Package Software Requirements Specification (WP970) Science Enabling Applications Development Work Package Software Requirements Specification (WP970) prepared by: approved by: reference: issue: revision: 1 X. Luri, P.M. Marrese, F.Julbe, H. Enke, N. Walton,

More information

Data Lab System Architecture

Data Lab System Architecture Data Lab System Architecture Data Lab Context Data Lab Architecture Astronomer s Desktop Web Page Cmdline Tools Legacy Apps User Code User Mgmt Data Lab Ops Monitoring Presentation Layer Authentication

More information

THE US NATIONAL VIRTUAL OBSERVATORY. IVOA WebServices. William O Mullane The Johns Hopkins University

THE US NATIONAL VIRTUAL OBSERVATORY. IVOA WebServices. William O Mullane The Johns Hopkins University THE US NATIONAL VIRTUAL OBSERVATORY IVOA WebServices William O Mullane The Johns Hopkins University 1 What exactly is a WS? FROM http://dev.w3.org/cvsweb/~checkout~/2002/ws/arch/wsa/wd-wsaarch.html#whatisws

More information

Data Lab Operations Concepts

Data Lab Operations Concepts Data Lab Operations Concepts 1 Introduction This talk will provide an overview of Data Lab components to be implemented Core infrastructure User applications Science Capabilities User Interfaces The scope

More information

The Planck Legacy Archive: current status, contents and future development. Xavier Dupac ESA-ESAC Villanueva de la Cañada, Spain

The Planck Legacy Archive: current status, contents and future development. Xavier Dupac ESA-ESAC Villanueva de la Cañada, Spain The Planck Legacy Archive: current status, contents and future development Xavier Dupac ESA-ESAC Villanueva de la Cañada, Spain Outline Introduction Schedule Scientific contents of the PLA Additional contents

More information

ASKAP Science Data Archive: Users and Requirements CSIRO ASTRONOMY AND SPACE SCIENCE (CASS)

ASKAP Science Data Archive: Users and Requirements CSIRO ASTRONOMY AND SPACE SCIENCE (CASS) ASKAP Science Data Archive: Users and Requirements CSIRO ASTRONOMY AND SPACE SCIENCE (CASS) Jessica Chapman, Data Workshop March 2013 ASKAP Science Data Archive Talk outline Data flow in brief Some radio

More information

Lecture 5b: Data Mining. Peter Wheatley

Lecture 5b: Data Mining. Peter Wheatley Lecture 5b: Data Mining Peter Wheatley Data archives Most astronomical data now available via archives Raw data and high-level products usually available Data reduction software often specific to individual

More information

Observer Access to the Cherenkov Telescope Array

Observer Access to the Cherenkov Telescope Array Observer Access to the Cherenkov Telescope Array IRAP, Toulouse, France E-mail: jknodlseder@irap.omp.eu V. Beckmann APC, Paris, France E-mail: beckmann@apc.in2p3.fr C. Boisson LUTh, Paris, France E-mail:

More information

Software challenges in the implementation of large surveys: the case of J-PAS

Software challenges in the implementation of large surveys: the case of J-PAS Software challenges in the implementation of large surveys: the case of J-PAS 1/21 Paulo Penteado - IAG/USP pp.penteado@gmail.com http://www.ppenteado.net/ast/pp_lsst_201204.pdf (K. Taylor) (A. Fernández-Soto)

More information

The Murchison Widefield Array Data Archive System. Chen Wu Int l Centre for Radio Astronomy Research The University of Western Australia

The Murchison Widefield Array Data Archive System. Chen Wu Int l Centre for Radio Astronomy Research The University of Western Australia The Murchison Widefield Array Data Archive System Chen Wu Int l Centre for Radio Astronomy Research The University of Western Australia Agenda Dataflow Requirements Solutions & Lessons learnt Open solution

More information

MAST: The Mikulski Archive for Space Telescopes

MAST: The Mikulski Archive for Space Telescopes MAST: The Mikulski Archive for Space Telescopes Richard L. White Space Telescope Science Institute 2015 April 1, NRC Space Science Week/CBPSS A model for open access The NASA astrophysics data archives

More information

EChO Ground Segment: Overview & Science Operations Assumptions

EChO Ground Segment: Overview & Science Operations Assumptions EChO Ground Segment: Overview & Science Operations Assumptions Matthias Ehle & the Science Ground Segment Working Group EChO Science Operations Study Manager ESA-ESAC, Madrid Science Operations Department/Division

More information

and the VO-Science Francisco Jiménez Esteban Suffolk University

and the VO-Science Francisco Jiménez Esteban Suffolk University The Spanish-VO and the VO-Science Francisco Jiménez Esteban CAB / SVO (INTA-CSIC) Suffolk University The Spanish-VO (SVO) IVOA was created in June 2002 with the mission to facilitate the international

More information

Multidimensional Data in the Virtual Observatory

Multidimensional Data in the Virtual Observatory IX Reunión Científica de la SEA Madrid- 15/09/2010 Red Temática SVO Multidimensional Data in the Virtual Observatory José Enrique Ruiz Grupo AMIGA Instituto de Astrofísica de Andalucía CSIC Contextual

More information

LSST Resources for Data Analysis

LSST Resources for Data Analysis LSST Resources for the Community Lynne Jones University of Washington/LSST 1 Data Flow Nightly Operations : (at base facility) Each 15s exposure = 6.44 GB (raw) 2x15s = 1 visit 30 TB / night Generates

More information

CAUP s Astronomical Instrumentation and Surveys

CAUP s Astronomical Instrumentation and Surveys CAUP s Astronomical Instrumentation and Surveys CENTRO DE ASTROFÍSICA DA UNIVERSIDADE DO PORTO www.astro.up.pt Sérgio A. G. Sousa Team presentation sousasag@astro.up.pt CAUP's Astronomical Instrumentation

More information

The Virtual Observatory in Action

The Virtual Observatory in Action The Virtual Observatory in Action VO drivers VO vision VO progress World AstroGrid VO Desktop demo Oxford erc Andy Lawrence Jan 2008 VO drivers : science science services several trends lead to science

More information

Organization of VizieR's Catalogs Archival

Organization of VizieR's Catalogs Archival Organization of VizieR's Catalogs Archival Organization of VizieR's Catalogs Archival Table of Contents Foreword...2 Environment applied to VizieR archives...3 The archive... 3 The producer...3 The user...3

More information

Australian Virtual Observatory

Australian Virtual Observatory Australian Virtual Observatory International Astronomical Union GA 2003 Joint Discussion 08 17th-18th July 2003 Sydney David Barnes The University of Melbourne Our take on virtual observatories bring legacy

More information

The ISO Data Archive

The ISO Data Archive the iso data archive The ISO Data Archive C. Arviset & T. Prusti ISO Data Centre, ESA Directorate of Scientific Programmes, Villafranca, Spain Introduction ISO was the world s first true orbiting astronomical

More information

How To Understand And Understand The Science Of Astronomy

How To Understand And Understand The Science Of Astronomy Introduction to the VO Christophe.Arviset@esa.int ESAVO ESA/ESAC Madrid, Spain The way Astronomy works Telescopes (ground- and space-based, covering the full electromagnetic spectrum) Observatories Instruments

More information

LSST and the Cloud: Astro Collaboration in 2016 Tim Axelrod LSST Data Management Scientist

LSST and the Cloud: Astro Collaboration in 2016 Tim Axelrod LSST Data Management Scientist LSST and the Cloud: Astro Collaboration in 2016 Tim Axelrod LSST Data Management Scientist DERCAP Sydney, Australia, 2009 Overview of Presentation LSST - a large-scale Southern hemisphere optical survey

More information

The NOAO Science Archive and NVO Portal: Information & Guidelines

The NOAO Science Archive and NVO Portal: Information & Guidelines The NOAO Science Archive and NVO Portal: Information & Guidelines Mark Dickinson, 14 March 2008 Orig. document by Howard Lanning & Mark Dickinson Thank you for your help testing the new NOAO Science Archive

More information

Data centres in the. Virtual Observatory. F. Genova, IVOA Small Project meeting, September 2006 1

Data centres in the. Virtual Observatory. F. Genova, IVOA Small Project meeting, September 2006 1 Data centres in the Virtual Observatory F. Genova, IVOA Small Project meeting, September 2006 1 VO status (1) Many national projects Very different contexts/financing agencies A really world-wide, global

More information

Galaxy Survey data analysis using SDSS-III as an example

Galaxy Survey data analysis using SDSS-III as an example Galaxy Survey data analysis using SDSS-III as an example Will Percival (University of Portsmouth) showing work by the BOSS galaxy clustering working group" Cosmology from Spectroscopic Galaxy Surveys"

More information

The Mantid Project. The challenges of delivering flexible HPC for novice end users. Nicholas Draper SOS18

The Mantid Project. The challenges of delivering flexible HPC for novice end users. Nicholas Draper SOS18 The Mantid Project The challenges of delivering flexible HPC for novice end users Nicholas Draper SOS18 What Is Mantid A framework that supports high-performance computing and visualisation of scientific

More information

Astro Runtime An API for the Virtual Observatory

Astro Runtime An API for the Virtual Observatory A PPARC funded project Astro Runtime An API for the Virtual Observatory Noel Winstanley - Jodrell Bank Observatory Astro Runtime A library of virtual-observatory functions and clients. integrates VO standards,

More information

CoSADIE Data Centre Forum. Summary and conclusions

CoSADIE Data Centre Forum. Summary and conclusions CoSADIE Data Centre Forum Summary and conclusions A forum for the data centre community Tell the story of what they do and of their relationship with the VO Know each other better Community building Communication

More information

Research Data Storage, Sharing, and Transfer Options

Research Data Storage, Sharing, and Transfer Options Research Data Storage, Sharing, and Transfer Options Principal investigators should establish a research data management system for their projects including procedures for storing working data collected

More information

Bringing Big Data to the Solar System. Paulo Penteado Northern Arizona University, Flagstaff (visiting David Trilling)

Bringing Big Data to the Solar System. Paulo Penteado Northern Arizona University, Flagstaff (visiting David Trilling) Bringing Big Data to the Solar System Paulo Penteado Northern Arizona University, Flagstaff (visiting David Trilling) pp.penteado@gmail.com http://www.ppenteado.net What is Big Data and why do we care?

More information

Data Centre Alliance - Science

Data Centre Alliance - Science Data Centre Alliance - Science Mark ALLEN, Jonathan TEDDS & DCA IST Nov 2008 IST Internal Science Team Activities of the IST are being rounded off for end of project Coordinated by IST telecon IST: T2-DCA-IST

More information

Cross-Matching Very Large Datasets

Cross-Matching Very Large Datasets 1 Cross-Matching Very Large Datasets María A. Nieto-Santisteban, Aniruddha R. Thakar, and Alexander S. Szalay Johns Hopkins University Abstract The primary mission of the National Virtual Observatory (NVO)

More information

Science@ESA vodcast series. Script for Episode 6 Charting the Galaxy - from Hipparcos to Gaia

Science@ESA vodcast series. Script for Episode 6 Charting the Galaxy - from Hipparcos to Gaia Science@ESA vodcast series Script for Episode 6 Charting the Galaxy - from Hipparcos to Gaia Available to download from http://sci.esa.int/gaia/vodcast Hello, I m Rebecca Barnes and welcome to the Science@ESA

More information

Astronomical Data Analysis Software & Systems XVI

Astronomical Data Analysis Software & Systems XVI Astronomical Data Analysis Software & Systems XVI 15-18 October 2006 Tucson, Arizona, USA Events ADASS XVI Today Calendar Conference Schedule Meeting Program Recent News Birds of a Feather Banquet Conference

More information

PRESENTATION SPACE MISSIONS

PRESENTATION SPACE MISSIONS GENERAL PRESENTATION SPACE MISSIONS CONTENTS 1. Who we are 2. What we do 3. Space main areas 4. Space missions Page 2 WHO WE ARE GENERAL Multinational conglomerate founded in 1984 Private capital Offices

More information

Intro to Sessions 3 & 4: Data Management & Data Analysis. Bob Mann Wide-Field Astronomy Unit University of Edinburgh

Intro to Sessions 3 & 4: Data Management & Data Analysis. Bob Mann Wide-Field Astronomy Unit University of Edinburgh Intro to Sessions 3 & 4: Data Management & Data Analysis Bob Mann Wide-Field Astronomy Unit University of Edinburgh 1 Outline Data Management Issues Alternatives to monolithic RDBMS model Intercontinental

More information

To begin, visit this URL: http://www.ibm.com/software/rational/products/rdp

To begin, visit this URL: http://www.ibm.com/software/rational/products/rdp Rational Developer for Power (RDp) Trial Download and Installation Instructions Notes You should complete the following instructions using Internet Explorer or Firefox with Java enabled. You should disable

More information

FRACTAL SYSTEM & PROJECT SUITE: ENGINEERING TOOLS FOR IMPROVING DEVELOPMENT AND OPERATION OF THE SYSTEMS. (Spain); ABSTRACT 1.

FRACTAL SYSTEM & PROJECT SUITE: ENGINEERING TOOLS FOR IMPROVING DEVELOPMENT AND OPERATION OF THE SYSTEMS. (Spain); ABSTRACT 1. FRACTAL SYSTEM & PROJECT SUITE: ENGINEERING TOOLS FOR IMPROVING DEVELOPMENT AND OPERATION OF THE SYSTEMS A. Pérez-Calpena a, E. Mujica-Alvarez, J. Osinde-Lopez a, M. García-Vargas a a FRACTAL SLNE. C/

More information

D. Briukhov, L. Kalinichenko, i D. Martynov, N. Skvortsov, S.Stupnikov, A. Vovchenko, V. Zakharov, O. Zhelenkova

D. Briukhov, L. Kalinichenko, i D. Martynov, N. Skvortsov, S.Stupnikov, A. Vovchenko, V. Zakharov, O. Zhelenkova APPLICATION DRIVEN MEDIATION MIDDLEWARE GRID-INFRASTRUCTUREINFRASTRUCTURE FOR PROBLEM SOLVING OVER MULTIPLE HETEROGENEOUS DISTRIBUTED INFORMATION RESOURCES The Third International Conference "Distributed

More information

ALMA Technical Support. George J. Bendo UK ALMA Regional Centre Node University of Manchester

ALMA Technical Support. George J. Bendo UK ALMA Regional Centre Node University of Manchester ALMA Technical Support George J. Bendo UK ALMA Regional Centre Node University of Manchester Overview ALMA organisation and services Websites o Web portal o Helpdesk Documentation Software o CASA o Observing

More information

CADC and CANFAR: Extending the role of the data centre. Séverin Gaudet Canadian Astronomy Data Centre

CADC and CANFAR: Extending the role of the data centre. Séverin Gaudet Canadian Astronomy Data Centre CADC and CANFAR: Extending the role of the data centre Séverin Gaudet Canadian Astronomy Data Centre February 2012 Canadian Astronomy Data Centre Heterogeneous collection: Multiple missions, facilities

More information

MySQL Enterprise Monitor

MySQL Enterprise Monitor MySQL Enterprise Monitor Lynn Ferrante Principal Sales Consultant 1 Program Agenda MySQL Enterprise Monitor Overview Architecture Roles Demo 2 Overview 3 MySQL Enterprise Edition Highest Levels of Security,

More information

Deployment of Intersystems Caché with GUMS on Amazon EC2

Deployment of Intersystems Caché with GUMS on Amazon EC2 Deployment of Intersystems Caché with GUMS on Amazon EC2 prepared by: Daniel Tapiador affiliation : ESAC Science Archives and VO Team approved by: GAP reference: issue: 0D revision: 0 date: 2011-10-18

More information

CASA Analysis and Visualization

CASA Analysis and Visualization CASA Analysis and Visualization Synthesis... 1 Current Status... 1 General Goals and Challenges... 3 Immediate Goals... 5 Harnessing Community Development... 7 Synthesis We summarize capabilities and challenges

More information

Living Requirements Document: Sniffit

Living Requirements Document: Sniffit Living Requirements Document: Sniffit RFID locator system Andrew Pang Braulio Fonseca Enrique Gutierrez Nader Khalil Sohan Shah Victor Porter Introduction Sniffit is a handy tracking application that helps

More information

Big Data, Cloud & Virtualization

Big Data, Cloud & Virtualization Big Data, Cloud & Virtualization Tokyo, 2014 Vik Nagjee Product Manager, Database Platforms Big Data 1 What s Big about {Big} Data? The 3 V s Volume Variety Velocity The {Big} Data Challenge Image credit:

More information

Virtual machine W4M- Galaxy: Installation guide

Virtual machine W4M- Galaxy: Installation guide Virtual machine W4M- Galaxy: Installation guide Christophe Duperier August, 6 th 2014 v03 This document describes the installation procedure and the functionalities provided by the W4M- Galaxy virtual

More information

VAO Single Sign-on with OpenID

VAO Single Sign-on with OpenID VAO Single Sign-on with OpenID Ray Plante VAO NCSA 20 October 2011 IVOA Interoperability 20 Meeting October -- Pune 2011 IVOA Interoperability Meeting -- Pune Common Identities across the VO VAO Single

More information

Mac OS X Security Checklist:

Mac OS X Security Checklist: Mac OS X Security Checklist: Implementing the Center for Internet Security Benchmark for OS X Recommendations for securing Mac OS X The Center for Internet Security (CIS) benchmark for OS X is widely regarded

More information

Managing Large Imagery Databases via the Web

Managing Large Imagery Databases via the Web 'Photogrammetric Week 01' D. Fritsch & R. Spiller, Eds. Wichmann Verlag, Heidelberg 2001. Meyer 309 Managing Large Imagery Databases via the Web UWE MEYER, Dortmund ABSTRACT The terramapserver system is

More information

Data Management Plan Extended Baryon Oscillation Spectroscopic Survey

Data Management Plan Extended Baryon Oscillation Spectroscopic Survey Data Management Plan Extended Baryon Oscillation Spectroscopic Survey Experiment description: eboss is the cosmological component of the fourth generation of the Sloan Digital Sky Survey (SDSS-IV) located

More information

SGS System Requirements Review

SGS System Requirements Review SGS System Requirements Review John Hoar Lausanne 09/06/2015 Quick recap The Science Ground Segment transforms measurements made by the Euclid instruments into data products ready for scientific use. This

More information

Usage statistics and archiving process of VizieR data in the VO context

Usage statistics and archiving process of VizieR data in the VO context Usage statistics and archiving process of VizieR data in the VO context VO Implementation status VO Implementation status Application VOTable (1.1 1.2 1.3) Semantic Data Access Layer Data Model MOC SAMP

More information

United Nations - Nations Unies. COSPAR Symposium. Measuring the Universe. Looking Back in Time with Modern Astronomy. Monday, 2nd February 2015

United Nations - Nations Unies. COSPAR Symposium. Measuring the Universe. Looking Back in Time with Modern Astronomy. Monday, 2nd February 2015 United Nations - Nations Unies COSPAR Symposium Measuring the Universe Looking Back in Time with Modern Astronomy Monday, 2nd February 2015 15:00 18:00 Conference Rooms M1, Building M, Vienna International

More information

Data Mining with Hadoop at TACC

Data Mining with Hadoop at TACC Data Mining with Hadoop at TACC Weijia Xu Data Mining & Statistics Data Mining & Statistics Group Main activities Research and Development Developing new data mining and analysis solutions for practical

More information

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

Planet Size Data!? Gartner's 10 key IT trends for 2012: unstructured data will grow some 80% over the course of the next

The Rise of Industrial Big Data. Brian Courtney General Manager Industrial Data Intelligence

Agenda Introduction Big Data for the industrial sector Case in point: Big data saves millions at GE Energy Seeking

Concepts and Architecture of Grid Computing. Advanced Topics Spring 2008 Prof. Robert van Engelen

Overview Grid users: who are they? Concept of the Grid Challenges for the Grid Evolution of Grid systems

SURFsara Data Services

SUPPORTING DATA-INTENSIVE SCIENCES Mark van de Sanden The world of the many Many different users (well organised (international) user communities, research groups, universities,

A Study of Data Management Technology for Handling Big Data

Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 9, September 2014,

State of SIEM Challenges, Myths & technology Landscape 4/21/2013 1

Introduction What's in a name? SIEM? SEM? SIM? Technology Drivers Challenges & Technology Overview Deciding what's right for you Worst

DiamondStream Data Security Policy Summary

Overview This document describes DiamondStream's standard security policy for accessing and interacting with proprietary and third-party client data. This covers

DCA QUESTIONNAIRE V0.1-1 INTRODUCTION AND IDENTIFICATION OF THE DATA CENTRE

Introduction - The EuroVO-DCA Census questionnaire The Euro-VO Data Centre Alliance (http://www.euro-vo.org/pub/dca/overview.html)

SSL VPN. Virtual Appliance Installation Guide. Virtual Private Networks

CONTENTS Introduction, Installing the Virtual Appliance, Configuring Appliance Operating System Settings, Setting up the

Oracle Big Data SQL Technical Update

Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

Data providers technical feedback

DCA WP 3.2 (LU) Report May 2007 Anita Richards (JBO) & Jonathan Tedds (LU) Terms of reference Activities up to 1st PCT meeting Thurs Session 1 Plans Thurs Session 3 WP

Karl Lum Partner, LabKey Software klum@labkey.com. Evolution of Connectivity in LabKey Server

Connecting Data to LabKey Server Lowering the barrier to connect scientific data to LabKey Server Increased

TDRS / MUST. and. what it might do for you

Dr. Marcus G. F. Kirsch XMM-Newton Deputy Spacecraft Operations Manager with Inputs from José-Antonio Martínez-Heras, Black Hat S.L., Spain European Space Agency

Databricks. A Primer

Who is Databricks? Databricks' vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful

A New Data Visualization and Analysis Tool

Author: Kern Date: 22 February 2013 NRAO Doc. #: Version: 1.0 PREPARED BY ORGANIZATION DATE Jeff Kern NRAO 22

NEXT GENERATION ARCHIVE MIGRATION TOOLS

Cloud Ready, Scalable, & Highly Customizable - Migrate 6.0 Ensures Faster & Smarter Migrations EXECUTIVE SUMMARY Data migrations and the products used to perform

GEOCOMPUTATIONS AND RELATED WEB SERVICES

J. A. Rod Blais Dept. of Geomatics Engineering Pacific Institute for the Mathematical Sciences University of Calgary, Calgary, Alberta T2N 1N4 blais@ucalgary.ca

Using the Parkes Pulsar Data Archive

JART http://www.jart.ac.cn J. Khoo 1, G. Hobbs 1, R. N. Manchester 1, D. Miller 2, J. Dempsey 2 1 CSIRO Astronomy and Space Science, Australia Telescope National Facility,

Scaling Big Data Mining Infrastructure: The Smart Protection Network Experience

黃振修 (Chris Huang), SPN proactive cloud anti-virus technology architect About Me SPN proactive cloud anti-virus technology architect SPN Hadoop computing infrastructure architect Hadoop in Taiwan

CARRIOTS TECHNICAL PRESENTATION

Alvaro Everlet, CTO alvaro.everlet@carriots.com @aeverlet Oct 2013 1. WHAT IS CARRIOTS 2. BUILDING AN IOT PROJECT 3. DEVICES 4. PLATFORM

Big Data and evolution of the Ground System EO ENG and the imarine case

Andrea Manieri Engineering R&D Lab. Rome, 26/11/2013 AGENDA The Big data challenges seen from the space Engineering and (some)

On-line supplement to manuscript Galaxy for collaborative analysis of ENCODE data: Making large-scale analyses biologist-friendly

DANIEL BLANKENBERG, JAMES TAYLOR, IAN SCHENCK, JIANBIN HE, YI ZHANG, MATTHEW

Software Development for Virtual Observatories

BRAVO Workshop February 2007 Rafael Santos Warning! This presentation is biased. I'll talk about VO software development, including some under the hood

Databricks. A Primer

Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to dramatically

Classroom Exercise ASTR 390 Selected Topics in Astronomy: Astrobiology A Hertzsprung-Russell Potpourri

Purpose: 1) To understand the H-R Diagram; 2) To understand how the H-R Diagram can be used to follow

Archival of raw and analysed radar data at EISCAT and worldwide

Carl-Fredrik Enell, EISCAT Scientific Association COOPEUS workshop and EGI-CC kickoff, 11 March 2015 C-F Enell, EISCAT Radar data archival

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,

Outcomes of the CDS Technical Infrastructure Workshop

Baudouin Raoult Baudouin.raoult@ecmwf.int Funded by the European Union Implemented by Evaluation & QC function C3S architecture from European commission

THE CCLRC DATA PORTAL

Glen Drinkwater, Shoaib Sufi CCLRC Daresbury Laboratory, Daresbury, Warrington, Cheshire, WA4 4AD, UK. E-mail: g.j.drinkwater@dl.ac.uk, s.a.sufi@dl.ac.uk Abstract: The project aims

Why a single source for assets should be. the backbone of all your digital activities

Navigating in the digital landscape The old era of traditional marketing has long passed. Today, customers expect to

Research Data Storage, Sharing, and Transfer Options

Principal investigators should establish a research data management system for their projects including procedures for storing working data collected

Integrated Performance Monitoring

JENNIFER provides comprehensive and integrated performance monitoring through its many dashboard views, which include Realuser Monitoring and Real-time Topology. USING

Optimizing IT Deployment Issues

Trends and Challenges for Engineering Simulation Barbara Hutchings barbara.hutchings@ansys.com Outline Deployment Challenges and Trends Extreme scale up and scale out

Canadian Astronomy Data Centre. Séverin Gaudet David Schade Canadian Astronomy Data Centre

Data Activities in Astronomy Features of the astronomy data landscape Multi-wavelength datasets are increasingly

SOA, case Google. Faculty of technology management 07.12.2009 Information Technology Service Oriented Communications CT30A8901.

Written by: Sampo Syrjäläinen, 0337918 Jukka Hilvonen, 0337840 Contents 1.

Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging

Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief

Secure, Shared Drop-in File Access for Enterprise Users, Anytime and Anywhere 2011 DataDirect Networks. All Rights Reserved DDN WOS Cloud

Pure1 Manage User Guide

User Guide 11/2015 Contents Overview, Pure1 Manage Navigation, Pure1 Manage - Arrays Page, Card View, Expanded Card View, List View, Pure1 Manage Replication Page, Pure1

Google Cloud Data Platform & Services. Gregor Hohpe

All About Data We Have More of It Internet data more easily available Logs user & system behavior Cheap Storage keep more of it Beyond just Relational

SITools2 as VO service provider: an example with Herschel at IDOC (Integrated Data and Operation Center)

SITools2 is a CNES generic tool performed by a joint effort between CNES and scientific

MONITORING RED HAT GLUSTER SERVER DEPLOYMENTS With the Nagios IT infrastructure monitoring tool

TECHNOLOGY DETAIL INTRODUCTION Storage system monitoring is a fundamental task for a storage administrator.