Big Data, Big Challenges and new Paradigm for the Gaia Archive
1 Big Data, Big Challenges and new Paradigm for the Gaia Archive
Christophe Arviset, Head of ESAC Science Data Centre
15/03/2016
Issue/Revision: 1.0
Reference: Gaia Archive's big data challenges
Status: Issued
2 ESAC Science Data Centre: The Digital Library of the Universe
- At ESA's European Space Astronomy Centre, near Madrid, Spain
- Science Archives from >15 space missions: Astronomy, Planetary, Solar System
- From all phases (dev, ops, post-ops, legacy)
- Calibrated processed data, high level data products, raw data
- Different users:
   - Scientific Community (public access)
   - PI team and observers (controlled access)
   - Science Operations Team (privileged access)
- Common Software Architecture and Look and Feel
C.Arviset Gaia Archive's big data challenges BIDS /03/2016 Slide 2
3 Gaia Satellite and Data Overview
1. ESA Cornerstone mission, launched 19/12/
2. Stereoscopic census of the Galaxy over 5 years
   a. 1-2 billion sources with unprecedented accuracy
   b. 100 TB downlink
   c. Up to 1 PB calibrated data
   telescope transits, astrometric observations, 150 x 10^6 spectra
3. Big data processing challenge as well!
   a. (outside the scope of this presentation)
4. 1st public release of Gaia catalogue in summer
5. ~1 new release per year
6. Final catalogue ~2022
4 Gaia is definitely a major astronomy Big Data project
- Volume: 1 PB of data in total, not really big data
- Velocity: massively complex data processing challenges, FLOP
- Variety: source catalogue, spectra, telescope transits
- Veracity: astrometry, photometry and spectroscopy with high quality
- Value: believed to revolutionize astronomy
Most accurate, consistent, complete, and challenging astrometric data set to date
5 Standard Archives Architecture
[Architecture diagram; labels: Command line (http), Browser GUI (http), VO Apps (SAMP; SIAP, SSAP), ftp, science archive, VO services, Database, Data Repository, User Disk]
6 ESAC Archives Volume evolution
- All data stored on hard disks and distributed through the Internet
- Euclid will add up to ~150 PB by 2023
7 Gaia Archive current content
Simulations
- GUMS
   - Milky Way: 2x10^9 rows
   - Large Magellanic Cloud: 7.5x10^6 rows
   - Small Magellanic Cloud: 1.2x10^6 rows
   - Galaxies: 38x10^6 rows
   - Quasars: 10^6 rows
- GOG: 1.8x10^9 rows
External Catalogues
- IGSL (Initial Gaia Source List): 1.2x10^9 rows
- 2MASS: 9.4x10^8 rows
- Tycho2: 2.5x10^6 rows
- UCAC4: 1.1x10^8 rows
Gaia
- TGAS (private validation team area): 2x10^6 rows
- Foreseen, first Gaia catalogue: >10^9 rows
8 Need for new paradigm
1. New ways required to access the Gaia catalogue and associated data
   a. Powerful query mechanism, asynchronous delivery of results
   b. One query interface for all archive services and VO services
2. Users cannot download the whole catalogue and all the data
   a. Need to have user workspaces IN the Archive: user database space, user disk space
   b. User workspace shareable amongst various users
3. Bring user code to the data
   a. Part of the user workspace in the archive
   b. Share code with other users
The user works with the data WHERE the data is: the Archive 2.0 concept
9 Gaia Archive Architecture
[Architecture diagram; labels: VO protocols, +ADQL query language, archive core systems, UWS (job scheduler), Database, Data Repository, VOSpace (User Work Space), ftp, User Disk]
10 The interrogator / + ADQL (Browser GUI)
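As an illustration of the kind of query an ADQL interface accepts, here is a minimal sketch that builds a cone-search query string. The table name `public.tycho2` and the column names `ra` and `dec` are assumptions made for the example, not names taken from the slides; only the ADQL constructs themselves (`TOP`, `CONTAINS`, `POINT`, `CIRCLE`) come from the ADQL standard.

```python
def cone_search_adql(table, ra_deg, dec_deg, radius_arcsec, limit=100):
    """Build an ADQL cone-search query string.

    `table` and the column names `ra`/`dec` are hypothetical; adapt them
    to the schema actually published by the archive being queried.
    """
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("ra_deg must be in [0, 360)")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("dec_deg must be in [-90, 90]")
    if radius_arcsec <= 0:
        raise ValueError("radius_arcsec must be positive")

    # ADQL geometry functions take angles in degrees.
    radius_deg = radius_arcsec / 3600.0
    return (
        f"SELECT TOP {int(limit)} * FROM {table} "
        f"WHERE 1 = CONTAINS(POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {ra_deg:.6f}, {dec_deg:.6f}, {radius_deg:.6f}))"
    )


if __name__ == "__main__":
    # 5 arcsec cone around an arbitrary position.
    print(cone_search_adql("public.tycho2", 56.75, 24.1167, 5.0, limit=10))
```

The resulting string is what a user would paste into the query form or send through a script; the sketch does not contact any server.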
11 Upload / xmatch / sharing (Browser GUI)
- Upload: a table can be uploaded into the user's private area
- Crossmatch: an uploaded table can be crossmatched with any other table
- Sharing: any private table can be shared with other users
12 Catalogue crossmatch (Browser GUI)
- Crossmatch functionality is provided for any available table: public catalogues, user's tables and shared tables
- Crossmatch results:
   - Join table stored in the user's space on the server
   - Default crossmatch join query provided
   - Can be shared with other users
13 Gaia Archive Crossmatch Examples (20 threads) (Browser GUI)
Catalogue 1               Catalogue 2                 Radius (arcsec)   # results   Time
Tycho2 (2.5x10^6 rows)    2MASS PSC (4.7x10^8 rows)   1                 2,495,304   49s
Tycho2 (2.5x10^6 rows)    2MASS PSC (4.7x10^8 rows)   5                 2,614,      s
Tycho2 (2.5x10^6 rows)    IGSL (1.2x10^9 rows)        1                 2,600,542   46s
Tycho2 (2.5x10^6 rows)    IGSL (1.2x10^9 rows)        5                 2,829,401   55s
Tycho2 vs IGSL crossmatches are even faster than the ones with 2MASS because IGSL is located in the fastest local storage (PCIe), even though IGSL (similar to the final Gaia catalogue) is around 3 times bigger than 2MASS
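To make the crossmatch operation concrete, here is a minimal sketch of a positional match by angular separation. It is a brute-force, in-memory illustration, not the archive's server-side implementation, which the slides do not describe; the function names and the `(id, ra_deg, dec_deg)` tuple layout are assumptions chosen for the example.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions (degrees in, arcsec out).

    Uses the haversine formula, which stays accurate for small angles.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def crossmatch(cat1, cat2, radius_arcsec):
    """Return (id1, id2, separation) for every pair within the radius.

    Each catalogue is a list of (id, ra_deg, dec_deg) tuples. This is an
    O(N*M) loop, fine for a demonstration but far too slow for 10^9 rows,
    where a spatial index would be needed.
    """
    matches = []
    for id1, ra1, dec1 in cat1:
        for id2, ra2, dec2 in cat2:
            sep = angular_separation_arcsec(ra1, dec1, ra2, dec2)
            if sep <= radius_arcsec:
                matches.append((id1, id2, sep))
    return matches


if __name__ == "__main__":
    cat_a = [("a", 10.0, 20.0), ("b", 50.0, -30.0)]
    cat_b = [("x", 10.0, 20.0 + 0.5 / 3600.0),
             ("y", 50.0, -30.0 + 3.0 / 3600.0),
             ("z", 120.0, 5.0)]
    for radius in (1.0, 5.0):
        print(radius, crossmatch(cat_a, cat_b, radius))
```

Widening the radius from 1 to 5 arcsec adds matches, which is the same trend the result counts in the table show.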
14 User work space: VOSpace
15 VOSpace: Virtual storage for collaboration
- Dropbox for the VO
- Accessible from VO data access protocols
- Accessible from VO applications (TOPCAT, Aladin)
- Share with other users
- Your files
- Your software code
- Your virtual machines
16 Gaia Archive Architecture
[Architecture diagram; labels: VO protocols, +ADQL query language, archive core systems, UWS (job scheduler), Database, Data Repository, VOSpace (User Work Space), ftp, User Disk]
17 Command line: Gaia command line interface
- Everything that can be done from the Web GUI can be done by script
   1. Query (synchronous, asynchronous), login, table upload, crossmatch, download, etc.
- Various languages now available:
   1. Python, Java, C++
   2. More to come
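As a sketch of what a scripted asynchronous query involves, the function below lays out the sequence of HTTP requests defined by the IVOA TAP and UWS standards, without sending anything over the network. The base URL `https://archive.example.org/tap` and the `<job-id>` placeholder are hypothetical; the slides do not give the archive's actual endpoint, and the Gaia command line clients may wrap these steps differently.

```python
from urllib.parse import urlencode


def async_query_plan(base_url, adql, job_id="<job-id>"):
    """List the (method, url, body) steps of a TAP asynchronous query.

    The job URL is only known after step 1 in a real exchange (the server
    answers with a redirect to it); `job_id` is a placeholder here.
    """
    base = base_url.rstrip("/")
    job = f"{base}/async/{job_id}"
    create_body = urlencode({
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "QUERY": adql,
    })
    return [
        ("POST", f"{base}/async", create_body),                  # 1. create the job
        ("POST", f"{job}/phase", urlencode({"PHASE": "RUN"})),   # 2. start it
        ("GET", f"{job}/phase", ""),                             # 3. poll its phase
        ("GET", f"{job}/results/result", ""),                    # 4. fetch the result
    ]


if __name__ == "__main__":
    plan = async_query_plan("https://archive.example.org/tap",
                            "SELECT TOP 5 * FROM t")
    for method, url, body in plan:
        print(method, url, body)
```

A synchronous query collapses these steps into a single request to the `/sync` endpoint, which suits short queries; the asynchronous path suits long-running ones whose results are collected later.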
18 Gaia Archive Architecture
[Architecture diagram; labels: VO protocols, +ADQL query language, archive core systems, UWS (job scheduler), SAMP, SSAP, Database, Data Repository, VOSpace (User Work Space), ftp, User Disk]
19 Interoperability with other VO tools: SAMP (Simple Application Messaging Protocol)
[Diagram; labels: VO Application: TOPCAT, SAMP]
20 Gaia Archive Architecture
[Architecture diagram; labels: VO protocols, +ADQL query language, archive core systems, UWS (job scheduler), SAMP, SSAP, Database, ESASky, HiPS, VOSpace (User Work Space), Data Repository, ftp, User Disk]
21 Gaia Archive Architecture
[Architecture diagram; labels: VO protocols, +ADQL query language, archive core systems, UWS (job scheduler), SAMP, SSAP, Database, ESASky, HiPS, VOSpace (User Work Space), Data Repository, ftp, User Disk]
22 Gaia Added Value Interfaces
1. Extension of the Gaia Archive towards an archive of added-value software
2. Collaborative Archive ~ Archive 2.0
   a. Users can bring their code to the archive and run it there (through containers)
   b. Users can share their data and their code with other archive users
3. Re-use of some of the VO technologies ( to access data, VOSpace to save code and data)
4. Could be used to host Apps developed by anyone
   a. Specialized visualization (light curve folding, 3D, ...)
   b. Variable analysis
   c. Transient analysis
   d. Simulator execution
23 Gaia Added Value Interfaces Portal
1. ESA funds for GeoReturn activities, restricted to certain countries
   a. Industrial contracts coordinated from ESAC (V.Navarro)
2. GAVIP Portal to allow users to upload their code to run near the archive
   a. Ireland (Parameter space)
3. Four Demonstrators of Added Value Interfaces
   1. GAVITA, Transient Alerts interface
      a. Ireland (Parameter space)
   2. GAVIDAV, Advanced Visualisation
      a. Portugal (Fork.Research, Uninova, FFCUL)
   3. GAVISC, Spectral Classification
      a. Finland (Space Systems Finland)
   4. GAVITEA, Temporal Analysis
      a. Finland (University of Helsinki)
24 Big data: Histogram provision
1. Complementary to the query access, need to visualize big data
   a. Production of density maps, 1D histograms
2. Interactive visualization through VO applications (Aladin lite), integrated into the Archive
25 Big data: Histograms production
1. Need new big data techniques
   a. Map / Reduce processing paradigm
2. Need big data machines
   a. Big RAM, fast disks
   b. PostgreSQL DB
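The Map / Reduce idea applied to density maps can be sketched as follows: each chunk of the catalogue is binned independently (map), and the partial counts are then summed (reduce). This is an in-memory illustration under assumed inputs, namely chunks of `(ra_deg, dec_deg)` pairs and square bins of `bin_deg` degrees; it is not the archive's production pipeline, which the slides do not detail.

```python
from collections import Counter
from functools import reduce


def map_chunk(rows, bin_deg):
    """Map step: count sources per (ra_bin, dec_bin) cell for one chunk."""
    counts = Counter()
    for ra, dec in rows:
        # Shift declination by 90 degrees so bin indices are non-negative.
        cell = (int(ra // bin_deg), int((dec + 90.0) // bin_deg))
        counts[cell] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial histograms by summing cell counts."""
    merged = Counter(left)
    merged.update(right)
    return merged


def density_map(chunks, bin_deg=1.0):
    """Combine the per-chunk histograms into one sky density map."""
    partials = (map_chunk(chunk, bin_deg) for chunk in chunks)
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    chunks = [
        [(10.2, -5.1), (10.7, -5.9)],
        [(10.5, -5.5), (200.0, 45.0)],
    ]
    print(density_map(chunks, bin_deg=1.0))
```

Because the map step touches each chunk independently, chunks can be processed in parallel and only the small per-cell counts need to be combined, which is what makes the approach attractive for catalogues of 10^9 rows.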
26 Conclusions
1. Gaia brings us big data together with new big data challenges
   a. Some can be addressed with well-proven technologies (e.g. PostgreSQL)
   b. New technologies required (e.g. asynchronicity, visualization, Map/Reduce)
   c. Some key VO standards are fully part of the archive
2. New paradigm shift for Archives and data access services
   a. User work space inside the archive
   b. Analysis work is done where the data is
3. Archive 2.0: open, dynamic and collaborative archive
   a. Users share their data and their code through VO protocols
   b. Users participate in building the new ecosystem around the archive
27 Thanks
1. My co-authors: Javier Durán, Juan González, Raúl Gutiérrez, José Hernández, Uwe Lammers, Bruno Merin, Alcione Mora, Sara Nieto, William O'Mullane, Jesús Salgado, Juan Carlos Segovia
2. ESAC Science Data Centre and Gaia Archive team in particular
3. Gaia Science Operations Centre at ESAC
4. DPAC - Data Processing and Analysis Consortium
5. GAVIP + AVIs teams
28 IVOA Specifications
- ADQL: Astronomical Data Query Language. Language used to query data
- TAP: Table Access Protocol. A protocol to access tables that contain the data
- UWS: Universal Worker Service Pattern. A job scheduler/handler to manage data queries
- VOSpace: Interface to distributed storage. A virtual storage system (a "VO dropbox++"); some extensions required to fulfil science needs
- SAMP: Simple Application Messaging Protocol. A protocol for applications to interconnect
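To illustrate how a client follows a UWS-managed job, here is a minimal polling sketch. The phase names (PENDING, QUEUED, EXECUTING, COMPLETED, ERROR, ABORTED) come from the UWS standard; the `get_phase` callable is a stand-in for whatever HTTP call a real client would make, and `wait_for_job` is a hypothetical helper written for this example.

```python
import time

# Phases after which a UWS job will not change state any more.
TERMINAL_PHASES = {"COMPLETED", "ERROR", "ABORTED"}


def wait_for_job(get_phase, max_polls=100, delay_s=0.0):
    """Poll a job until it reaches a terminal phase.

    `get_phase` is any zero-argument callable returning the current phase
    string. Returns (final_phase, number_of_polls).
    """
    for poll in range(1, max_polls + 1):
        phase = get_phase()
        if phase in TERMINAL_PHASES:
            return phase, poll
        time.sleep(delay_s)  # a real client would wait a few seconds here
    raise TimeoutError(f"job still running after {max_polls} polls")


if __name__ == "__main__":
    # Simulated server answers, in the order a typical job moves through them.
    answers = iter(["PENDING", "QUEUED", "EXECUTING", "EXECUTING", "COMPLETED"])
    print(wait_for_job(lambda: next(answers)))
```

Separating the polling loop from the transport keeps the sketch testable without a server; the same loop would work with a function that issues a GET on the job's phase resource.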
Databricks A Primer Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to dramatically
More informationClassroom Exercise ASTR 390 Selected Topics in Astronomy: Astrobiology A Hertzsprung-Russell Potpourri
Classroom Exercise ASTR 390 Selected Topics in Astronomy: Astrobiology A Hertzsprung-Russell Potpourri Purpose: 1) To understand the H-R Diagram; 2) To understand how the H-R Diagram can be used to follow
More informationArchival of raw and analysed radar data at EISCAT and worldwide
Archival of raw and analysed radar data at EISCAT and worldwide Carl-Fredrik Enell, EISCAT Scientific Association COOPEUS workshop and EGI-CC kickoff, 11 March 2015 C-F Enell, EISCAT Radar data archival
More informationData-intensive HPC: opportunities and challenges. Patrick Valduriez
Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,
More informationOutcomes of the CDS Technical Infrastructure Workshop
Outcomes of the CDS Technical Infrastructure Workshop Baudouin Raoult Baudouin.raoult@ecmwf.int Funded by the European Union Implemented by Evaluation & QC function C3S architecture from European commission
More informationTHE CCLRC DATA PORTAL
THE CCLRC DATA PORTAL Glen Drinkwater, Shoaib Sufi CCLRC Daresbury Laboratory, Daresbury, Warrington, Cheshire, WA4 4AD, UK. E-mail: g.j.drinkwater@dl.ac.uk, s.a.sufi@dl.ac.uk Abstract: The project aims
More informationWhy a single source for assets should be. the backbone of all your digital activities
Why a single source for assets should be the backbone of all your digital activities Navigating in the digital landscape The old era of traditional marketing has long passed. Today, customers expect to
More informationResearch Data Storage, Sharing, and Transfer Options
Research Data Storage, Sharing, and Transfer Options Principal investigators should establish a research data management system for their projects including procedures for storing working data collected
More informationIntegrated Performance Monitoring
Integrated Performance Monitoring JENNIFER provides comprehensive and integrated performance monitoring through its many dashboard views, which include Realuser Monitoring and Real-time Topology. USING
More informationOptimizing IT Deployment Issues
Optimizing IT Deployment Issues Trends and Challenges for Engineering Simulation Barbara Hutchings barbara.hutchings@ansys.com 1 Outline Deployment Challenges and Trends Extreme scale up and scale out
More informationCanadian Astronomy Data Centre. Séverin Gaudet David Schade Canadian Astronomy Data Centre
Canadian Astronomy Data Centre Séverin Gaudet David Schade Canadian Astronomy Data Centre Data Activities in Astronomy Features of the astronomy data landscape Multi-wavelength datasets are increasingly
More informationSOA, case Google. Faculty of technology management 07.12.2009 Information Technology Service Oriented Communications CT30A8901.
Faculty of technology management 07.12.2009 Information Technology Service Oriented Communications CT30A8901 SOA, case Google Written by: Sampo Syrjäläinen, 0337918 Jukka Hilvonen, 0337840 1 Contents 1.
More informationOutline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging
Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging
More informationWOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief
DDN Solution Brief Personal Storage for the Enterprise WOS Cloud Secure, Shared Drop-in File Access for Enterprise Users, Anytime and Anywhere 2011 DataDirect Networks. All Rights Reserved DDN WOS Cloud
More informationPure1 Manage User Guide
User Guide 11/2015 Contents Overview... 2 Pure1 Manage Navigation... 3 Pure1 Manage - Arrays Page... 5 Card View... 5 Expanded Card View... 7 List View... 10 Pure1 Manage Replication Page... 11 Pure1
More informationGoogle Cloud Data Platform & Services. Gregor Hohpe
Google Cloud Data Platform & Services Gregor Hohpe All About Data We Have More of It Internet data more easily available Logs user & system behavior Cheap Storage keep more of it 3 Beyond just Relational
More informationSITools2 as VO service provider: an example with Herschel at IDOC (Integrated Data and Operation Center)
SITools2 as VO service provider: an example with Herschel at IDOC (Integrated Data and Operation Center) SITools 2 SITools2 is a CNES generic tool performed by a joint effort between CNES and scienefic
More informationMONITORING RED HAT GLUSTER SERVER DEPLOYMENTS With the Nagios IT infrastructure monitoring tool
TECHNOLOGY DETAIL MONITORING RED HAT GLUSTER SERVER DEPLOYMENTS With the Nagios IT infrastructure monitoring tool INTRODUCTION Storage system monitoring is a fundamental task for a storage administrator.
More information