EUDAT. Towards a pan-European Collaborative Data Infrastructure


1 EUDAT: Towards a pan-European Collaborative Data Infrastructure
Damien Lecarpentier, CSC-IT Center for Science, Finland
EISCAT User Meeting, Uppsala, 6 May 2013

3 Data trends
Exponential growth: Zettabytes, Exabytes, Petabytes, Terabytes, Gigabytes
Increasing complexity and variety
Where to store it? How to find it? How to make the most of it?

4 Collaborative Data Infrastructure: a framework for the future?
Data Generators, Users: user functionalities, data capture & transfer, virtual research environments
Community Support Services: data discovery & navigation, workflow generation, annotation, interpretability
Common Data Services: persistent storage, identification, authenticity, workflow execution, mining
Data Curation; Trust

5 Data Centers and Communities

6 Five research communities on board
EPOS: European Plate Observing System
CLARIN: Common Language Resources and Technology Infrastructure
ENES: Service for Climate Modelling in Europe
LifeWatch: Biodiversity Data and Observatories
VPH: The Virtual Physiological Human
All share common challenges: reference models and architectures, persistent data identifiers, metadata management, distributed data sources, data interoperability

7 Communities Data Centers

8 Safe Replication Service
Robust, safe and highly available data replication service for small- and medium-sized repositories
To guard against data loss in long-term archiving and preservation
To optimize access for users from different regions
To bring data closer to powerful computers for compute-intensive analysis
Diagram labels: PIDs, policy rules, EUDAT CDI domain of registered data
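The replication idea on the slide above can be illustrated with a minimal sketch. It is not EUDAT's implementation: the `Site`, `PidRecord` and `replicate` names, the in-memory stores, the "minimum number of verified replicas" policy rule and the `hypothetical-prefix/` PID format are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of a byte string as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Site:
    """Hypothetical data centre: an in-memory object store keyed by name."""
    name: str
    objects: dict = field(default_factory=dict)


@dataclass
class PidRecord:
    """Hypothetical PID record: checksum plus the sites holding a replica."""
    checksum: str
    locations: list


def replicate(name, data, source, targets, min_replicas, registry):
    """Copy an object to target sites until the policy rule is satisfied.

    The assumed policy rule is "at least min_replicas verified copies".
    """
    checksum = sha256_hex(data)
    source.objects[name] = data
    locations = [source.name]
    for site in targets:
        if len(locations) >= min_replicas:
            break
        site.objects[name] = bytes(data)  # stand-in for a network transfer
        # Only count the copy as a replica once its checksum matches.
        if sha256_hex(site.objects[name]) == checksum:
            locations.append(site.name)
    if len(locations) < min_replicas:
        raise RuntimeError("policy rule not met: too few verified replicas")
    # Made-up PID format; a real service would mint the identifier.
    pid = f"hypothetical-prefix/{checksum[:12]}"
    registry[pid] = PidRecord(checksum, locations)
    return pid


if __name__ == "__main__":
    registry = {}
    origin, mirror = Site("community-centre"), Site("general-centre")
    pid = replicate("waveform.dat", b"sample", origin, [mirror], 2, registry)
    print(pid, registry[pid].locations)
```

Recording the checksum alongside the replica locations is one way to let a later audit detect silent corruption at any site, which is the kind of data loss the service is meant to guard against.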

9 Data Staging Service
Support researchers in transferring large data collections from EUDAT storage to HPC facilities
Reliable, efficient, and easy-to-use tools to manage data transfers
Provide the means to reingest computational results back into the EUDAT infrastructure
Diagram labels: EUDAT CDI domain of registered data, PRACE HPC, HPC

10 Simple Store Service
Allow registered users to upload long tail data into the EUDAT store
Enable sharing objects and collections with other researchers
Utilise other EUDAT services to provide reliability and data retention
Simple upload, simple metadata, PID registration
Diagram label: EUDAT CDI domain of registered data

11 Metadata Service
Easily find collections of scientific data generated either by various communities or via EUDAT services
Access those data collections through the given references in the metadata to the relevant data stores
Europeana of scientific data
Diagram label: EUDAT CDI domain of registered data
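The find-then-follow-the-reference pattern described on the slide above can be sketched as a small in-memory catalogue. The record fields, the `search` function and the sample entries are invented for illustration and do not describe the actual EUDAT catalogue or its API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical aggregated metadata record for one data collection."""
    title: str
    community: str
    keywords: tuple
    reference: str  # made-up identifier pointing at the hosting data store


def search(catalogue, term):
    """Return references of records whose title or keywords match the term."""
    term = term.lower()
    return [
        record.reference
        for record in catalogue
        if term in record.title.lower()
        or any(term == keyword.lower() for keyword in record.keywords)
    ]


if __name__ == "__main__":
    # Invented sample records; the references are placeholders, not real PIDs.
    catalogue = [
        MetadataRecord("Seismic waveforms", "EPOS",
                       ("seismology", "waveform"), "hypothetical-prefix/0001"),
        MetadataRecord("Spoken language corpus", "CLARIN",
                       ("language", "speech"), "hypothetical-prefix/0002"),
    ]
    print(search(catalogue, "seismology"))
```

The catalogue holds only metadata; the data itself stays in the community or EUDAT data store that the reference resolves to.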

12 Building Blocks of the CDI
Metadata Catalogue: aggregated EUDAT metadata domain; data inventory
Data Staging: dynamic replication to HPC workspace for processing
Safe Replication: data curation and access optimization
Simple Store: researcher data store (simple upload, share and access)
AAI: network of trust among authentication and authorization actors

13 EUDAT Sites
General data centres
Community centres, representing all the associated community data centres

14 Resource coordination and site monitoring

15 New services?
Real-time data handling: handling of (near) real-time data streams generated by instruments (e.g. earth science)
Semantic Annotation: a generic cross-disciplinary solution for improving data sets by establishing references to recognized knowledge sources
Crowd Sourcing Web Services Memento Community Interests

16 Real-Time Data
Real-time data is everywhere!
Scientific machines generating data:
Particle accelerators: LHC
Astronomy telescopes: ESO VLT, LOFAR, SKA
Life sciences: DNA sequencing
Different types of sensors: seismology sensors, radar images, GPS tracking
HPC processing!
Other types of generated data:
Web-generated data: Twitter feeds, weather/traffic websites, stock exchange, ...
There is more data on the internet than there are storage devices to store it!
How to handle this data, which is in flux?
How can data objects, or parts of data objects, be identified?

17 EPOS's use case: data acquisition
Seismologist community scenario: hundreds of stations in Italy, tens of thousands in the world.
Real-time acquisition of data as seismological waveforms.

18 EPOS's use case: data acquisition
[Diagram: numbered data blocks 5, 6, 7, 9 separated by GAP markers; blocks arrive between minutes, hours, days]
Sensor data stream is continuous
Data objects are organized by sensor, file, and time series, with data blocks within a file
Sensor data does not always arrive sequentially, but it is recorded sequentially, so gaps are created
Data objects are replicated (Safe Replication) to other centers
When a data object is finished, register it with a PID
What to do when data objects are updated: replicate again and update the PID!
Can you reference a separate (part of a) time series?
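The gap problem on the slide above can be made concrete with a small sketch. It assumes each data block covers a half-open interval of seconds and that a data object counts as "finished" once its time window has no gaps. The `Block`, `find_gaps` and `ready_for_pid` names are hypothetical and not taken from EPOS or EUDAT software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One data block of a sensor time series, covering [start, end) seconds."""
    start: int
    end: int


def find_gaps(blocks, series_start, series_end):
    """Return the uncovered [start, end) intervals inside the series window."""
    gaps = []
    cursor = series_start  # everything before the cursor is already covered
    # Blocks may arrive in any order, so sort by recording time first.
    for block in sorted(blocks, key=lambda b: b.start):
        if block.start > cursor:
            gaps.append((cursor, min(block.start, series_end)))
        cursor = max(cursor, block.end)
        if cursor >= series_end:
            break
    if cursor < series_end:
        gaps.append((cursor, series_end))
    return gaps


def ready_for_pid(blocks, series_start, series_end):
    """Assumed completion rule: register a PID only when no gaps remain."""
    return not find_gaps(blocks, series_start, series_end)


if __name__ == "__main__":
    # Blocks listed in arrival order, which differs from recording order.
    arrived = [Block(0, 10), Block(20, 30), Block(10, 15)]
    print(find_gaps(arrived, 0, 30))
    arrived.append(Block(15, 20))  # the late block fills the gap
    print(ready_for_pid(arrived, 0, 30))
```

Under this rule a late block changes the object after earlier replication, which is the "replicate again and update the PID" case the slide raises.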

19 EUDAT and EISCAT
Several interactions over the last few months: EUDAT User Forum in London (2013), EISCAT User Meeting, NeIC conference, etc.
Interest from EUDAT to interact with EISCAT:
Mature and organized community: requirements easier to identify, possibility to test solutions effectively, sustainability, etc.
EISCAT 3D RI: new data and computing challenges. Big Data requires strategic changes and new partnerships!
EUDAT could also use EISCAT as a use case to develop its services (e.g. Real-Time Data)
Several services developed by EUDAT seem to fit EISCAT

20 Partnership opportunities: EUDAT and EISCAT
Storage: long-term or temporary storage on top of EISCAT's capacity; facilitating access to data sets for distributed users
Computing: access to HPC machines to perform simulations
Real-Time Data
How to move forward?
Join a pilot now! Free resources and support, exchange of expertise, test your infrastructure, etc.
Join the discussion on new services: express requirements, join a working group (e.g. RTD)

22 Welcome to the 2nd EUDAT Conference! Thank you!

EUDAT. Towards a pan-european Collaborative Data Infrastructure. Willem Elbers

EUDAT. Towards a pan-european Collaborative Data Infrastructure. Willem Elbers EUDAT Towards a pan-european Collaborative Data Infrastructure Willem Elbers EUDAT / MPI-TLA Focus meeting: Data repositories SURF, Utrecht March 3, 2014 Outline EUDAT project EUDAT services Summary and

More information

EUDAT - Open Data Services for Research

EUDAT - Open Data Services for Research EUDAT - Open Data Services for Research Per Öster 05.03.2015 CSC at a Glance Founded in 1971 as a technical support unit for Univac 1108 Connected Finland to the Internet in 1988 Reorganized as a company,

More information

The challenge of managing research data. Axel Berg

The challenge of managing research data. Axel Berg The challenge of managing research data Axel Berg Context The data deluge cannot be stopped Without adequate data management: - the ever-growing amounts and complexity of data will be non-controllable

More information

European Data Infrastructure - EUDAT Data Services & Tools

European Data Infrastructure - EUDAT Data Services & Tools European Data Infrastructure - EUDAT Data Services & Tools Dr. Ing. Morris Riedel Research Group Leader, Juelich Supercomputing Centre Adjunct Associated Professor, University of iceland BDEC2015, 2015-01-28

More information

How To Build An Open Source Data Infrastructure

How To Build An Open Source Data Infrastructure EUDAT Collaborative Data Infrastructure Towards the convergence of Compute, Data, Knowledge and Scientific Instruments Giuseppe Fiameni CINECA www.eudat.eu EUDAT receives funding from the European Union's

More information

Interaction with other IT projects: EUDAT2020, VLDATA, ENVRI PLUS,

Interaction with other IT projects: EUDAT2020, VLDATA, ENVRI PLUS, Interaction with other IT projects: EUDAT2020, VLDATA, ENVRI PLUS, A. Spinuso, L. Trani, A. Strollo and D. Bailo EPOS PP final meeting, Rome, 22-24 October 2014 OUTLINE WG1 and the EIDA use case A modular

More information

A public-private partnership building a multidisciplinary cloud platform for data intensive science

A public-private partnership building a multidisciplinary cloud platform for data intensive science This document produced by Members of the Helix Nebula consortium is licensed under a Creative Commons Attribution 3.0 Unported License. Permissions beyond the scope of this license may be available at

More information

data infrastructures framework for action for H2020

data infrastructures framework for action for H2020 data infrastructures framework for action for H2020 Event Open Access Policy in Portugal Lisbon, 17 June 2013 Carlos Morais Pires European Commission e-infrastructures, DG CNECT.C1 Author s views do not

More information

Data Management using irods

Data Management using irods Data Management using irods Fundamentals of Data Management September 2014 Albert Heyrovsky Applications Developer, EPCC a.heyrovsky@epcc.ed.ac.uk 2 Course outline Why talk about irods? What is irods?

More information

Impact of Big Data in Oil & Gas Industry. Pranaya Sangvai Reliance Industries Limited 04 Feb 15, DEJ, Mumbai, India.

Impact of Big Data in Oil & Gas Industry. Pranaya Sangvai Reliance Industries Limited 04 Feb 15, DEJ, Mumbai, India. Impact of Big Data in Oil & Gas Industry Pranaya Sangvai Reliance Industries Limited 04 Feb 15, DEJ, Mumbai, India. New Age Information 2.92 billions Internet Users in 2014 Twitter processes 7 terabytes

More information

Workprogramme 2014-15

Workprogramme 2014-15 Workprogramme 2014-15 e-infrastructures DCH-RP final conference 22 September 2014 Wim Jansen einfrastructure DG CONNECT European Commission DEVELOPMENT AND DEPLOYMENT OF E-INFRASTRUCTURES AND SERVICES

More information

Federated Authentication and Credential Translation in the EUDAT Collaborative Data Infrastructure

Federated Authentication and Credential Translation in the EUDAT Collaborative Data Infrastructure Federated Authentication and Credential Translation in the EUDAT Collaborative Data Infrastructure Ahmed Shiraz Memon (JSC - DE) Jens Jensen (STFC escience - UK) Ales Cernivec (XLAB - SL) Krzysztof Benedyczak

More information

Pre-Talk Talk. What does ESS look like as more of this CI arrives?

Pre-Talk Talk. What does ESS look like as more of this CI arrives? Cloud Computing Rob Fatland Microsoft Research For the MRC story: http://research.microsoft.com/azure WRF two years and counting: http://weatherservice.cloudapp.net Pre-Talk Talk What does ESS look like

More information

BIG DATA & SOCIAL INNOVATION KENNETH THOMAS, CLIENT MANAGER

BIG DATA & SOCIAL INNOVATION KENNETH THOMAS, CLIENT MANAGER BIG DATA & SOCIAL INNOVATION KENNETH THOMAS, CLIENT MANAGER 1 MAKING THE RIGHT DECISSION AT THE RIGHT PLACE AT THE RIGHT TIME 2 THE DATA MULTIPLIER EFFECT AT WORK BUSINESS DRIVEN HUMAN DRIVEN MACHINE DRIVEN

More information

Elephant Meeting on Big Data Services

Elephant Meeting on Big Data Services 1 di 5 15/04/2016 14:23 Home About News Services Technology Training Contact Tweets by EarthServer_EU Elephant Meeting on Big Data Services Elephant Meeting on Big Data Services EarthServer.eu 2 di 5 15/04/2016

More information

Big Data Challenges for e-science Infrastructure

Big Data Challenges for e-science Infrastructure Big Challenges for e-science Infrastructure Yuri Demchenko, SNE Group, University of Amsterdam AAA-Study Project COINFO2012 Conference 24-25 November 2012, Nanjing, China 23-25 November 2012, Nanjing Big

More information

Workspaces Concept and functional aspects

Workspaces Concept and functional aspects Mitglied der Helmholtz-Gemeinschaft Workspaces Concept and functional aspects A You-tube for science inspired by the High Level Expert Group Report on Scientific Data 21.09.2010 Morris Riedel, Peter Wittenburg,

More information

Italian Scientific Big Data Initiative

Italian Scientific Big Data Initiative Italian Scientific Big Data Initiative Sanzio Bassini Director of Supercomputing Application & Innovation Department S.Bassini@cineca.it Casalecchio di Reno (BO) Via Magnanelli 6/3, 40033 Casalecchio di

More information

Report of the DTL focus meeting on Life Science Data Repositories

Report of the DTL focus meeting on Life Science Data Repositories Report of the DTL focus meeting on Life Science Data Repositories Goal The goal of the meeting was to inform and discuss research data repositories for life sciences. The big data era adds to the complexity

More information

Big Data Processing in Cloud Environments

Big Data Processing in Cloud Environments Big Data in Cloud Environments Satoshi Tsuchiya Yoshinori Sakamoto Yuichi Tsuchimoto Vivian Lee In recent years, accompanied by lower prices of information and communications technology (ICT) equipment

More information

Deploying Multiscale Applications on European e-infrastructures

Deploying Multiscale Applications on European e-infrastructures Deploying Multiscale Applications on European e-infrastructures 04/06/2013 Ilya Saverchenko The MAPPER project receives funding from the EC's Seventh Framework Programme (FP7/2007-2013) under grant agreement

More information

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21)

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21) CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21) Goal Develop and deploy comprehensive, integrated, sustainable, and secure cyberinfrastructure (CI) to accelerate research

More information

Digital Preservation Lifecycle Management

Digital Preservation Lifecycle Management Digital Preservation Lifecycle Management Building a demonstration prototype for the preservation of large-scale multi-media collections Arcot Rajasekar San Diego Supercomputer Center, University of California,

More information

PIT adoption for Climate Data Management

PIT adoption for Climate Data Management PIT adoption for Climate Management The German Climate Computing Center (DKRZ) 5th RDA Plenary Stephan Kindermann Overview DKRZ: A climate science service provider services for the international climate

More information

Databases & Data Infrastructure. Kerstin Lehnert

Databases & Data Infrastructure. Kerstin Lehnert + Databases & Data Infrastructure Kerstin Lehnert + Access to Data is Needed 2 to allow verification of research results to allow re-use of data + The road to reuse is perilous (1) 3 Accessibility Discovery,

More information

USGS Community for Data Integration

USGS Community for Data Integration Community of Science: Strategies for Coordinating Integration of Data USGS Community for Data Integration Kevin T. Gallagher USGS Core Science Systems January 11, 2013 U.S. Department of the Interior U.S.

More information

Update on the Twitter Archive At the Library of Congress

Update on the Twitter Archive At the Library of Congress January 2013 Update on the Twitter Archive At the Library of Congress In April, 2010, the Library of Congress and Twitter signed an agreement providing the Library the public tweets from the company s

More information

Global Scientific Data Infrastructures: The Big Data Challenges. Capri, 12 13 May, 2011

Global Scientific Data Infrastructures: The Big Data Challenges. Capri, 12 13 May, 2011 Global Scientific Data Infrastructures: The Big Data Challenges Capri, 12 13 May, 2011 Data-Intensive Science Science is, currently, facing from a hundred to a thousand-fold increase in volumes of data

More information

EUDAT Infrastructure and Service Support

EUDAT Infrastructure and Service Support EUDAT Infrastructure and Service Support Achievements and Current Practice Johannes Reetz 2 nd EUDAT User Forum London, 11-12 March 2013 Topics Status of the Infrastructure (month 16) Operations and Operational

More information

BIG. Big Data Analysis John Domingue (STI International and The Open University) Big Data Public Private Forum

BIG. Big Data Analysis John Domingue (STI International and The Open University) Big Data Public Private Forum Big Data Analysis John Domingue (STI International and The Open University) Project co-funded by the European Commission within the 7th Framework Program (Grant Agreement No. 257943) 1 The Data landscape

More information

escidoc: una plataforma de nueva generación para la información y la comunicación científica

escidoc: una plataforma de nueva generación para la información y la comunicación científica escidoc: una plataforma de nueva generación para la información y la comunicación científica Matthias Razum FIZ Karlsruhe VII Workshop REBIUN sobre proyectos digitales Madrid, October 18 th, 2007 18.10.2007

More information

CLARIN-NL Second Open Call. Jan Odijk CLARIN-NL Call 2 Info-session Amsterdam, 26 Aug 2010

CLARIN-NL Second Open Call. Jan Odijk CLARIN-NL Call 2 Info-session Amsterdam, 26 Aug 2010 CLARIN-NL Second Open Call Jan Odijk CLARIN-NL Call 2 Info-session Amsterdam, 26 Aug 2010 Overview Background Project Types Project Goals Roles Resource Curation Projects Demonstrator Projects CLARIN Centres

More information

Quantum Leap in Open Source Collaboration

Quantum Leap in Open Source Collaboration Quantum Leap in Open Source Collaboration Bridging the gap between campus infrastructures Ton van Alebeek Harold Teunissen et al. April 2012 - #I2SMM12 Cyberinfra in the Netherlands All ICT activities

More information

European Plate Observing System (EPOS) - die europäische Infrastruktur zur nachhaltigen Integrierung von multidisziplinären Daten zur Festen Erde

European Plate Observing System (EPOS) - die europäische Infrastruktur zur nachhaltigen Integrierung von multidisziplinären Daten zur Festen Erde European Plate Observing System (EPOS) - die europäische Infrastruktur zur nachhaltigen Integrierung von multidisziplinären Daten zur Festen Erde Dr. Thomas Hoffmann & Frieder Euteneuer Deutsches GeoForschungsZentrum

More information

The Tonnabytes Big Data Challenge: Transforming Science and Education. Kirk Borne George Mason University

The Tonnabytes Big Data Challenge: Transforming Science and Education. Kirk Borne George Mason University The Tonnabytes Big Data Challenge: Transforming Science and Education Kirk Borne George Mason University Ever since we first began to explore our world humans have asked questions and have collected evidence

More information

Standards for Big Data in the Cloud

Standards for Big Data in the Cloud Standards for Big Data in the Cloud International Cloud Symposium 15/10/2013 Carola Carstens (Project Officer) DG CONNECT, Unit G3 Data Value Chain European Commission Outline 1) Data Value Chain Unit

More information

Thomas Usländer Fraunhofer IITB

Thomas Usländer Fraunhofer IITB ORCHESTRA Day Stresa, 12 December 2007 ORCHESTRA Architecture - Behind the Scenes Thomas Usländer Fraunhofer IITB ORCHESTRA Consortium ORCHESTRA Ambition Analysis Maps Info Centre Archive Control centre

More information

Learning from Big Data in

Learning from Big Data in Learning from Big Data in Astronomy an overview Kirk Borne George Mason University School of Physics, Astronomy, & Computational Sciences http://spacs.gmu.edu/ From traditional astronomy 2 to Big Data

More information

BIG Big Data Public Private Forum

BIG Big Data Public Private Forum DATA STORAGE Martin Strohbach, AGT International (R&D) THE DATA VALUE CHAIN Value Chain Data Acquisition Data Analysis Data Curation Data Storage Data Usage Structured data Unstructured data Event processing

More information

BIG DATA PUBLIC PRIVATE FORUM

BIG DATA PUBLIC PRIVATE FORUM BIG DATA PUBLIC PRIVATE FORUM Agenda 09:00-10:30 9:00-9:20 9:20-9:55 9:55-10:30 The Big Project Results (Session 1) - The Big Project - Welcome and Introduction Nuria De Lama (ATOS Spain) - Key Technology

More information

A Future Scenario of interconnected EO Platforms How will EO data be used in 2025?

A Future Scenario of interconnected EO Platforms How will EO data be used in 2025? A Future Scenario of interconnected EO Platforms How will EO data be used in 2025? ESA UNCLASSIFIED For Official Use European EO data asset Heritage missions Heritage Core GS (data preservation, curation

More information

Data Literacy For All: Astrophysics and Beyond (Astronomy is evidence-based forensic science, thus it is a data & information science)

Data Literacy For All: Astrophysics and Beyond (Astronomy is evidence-based forensic science, thus it is a data & information science) Data Literacy For All: Astrophysics and Beyond (Astronomy is evidence-based forensic science, thus it is a data & information science) Kirk Borne George Mason University, Fairfax, VA www.kirkborne.net

More information

Big Data Just Noise or Does it Matter?

Big Data Just Noise or Does it Matter? Big Data Just Noise or Does it Matter? Opportunities for Continuous Auditing Presented by: Solon Angel Product Manager Servers The CaseWare Group. Founded in 1988. An industry leader in providing technology

More information

Data sharing and Big Data in the physical sciences. 2 October 2015

Data sharing and Big Data in the physical sciences. 2 October 2015 Data sharing and Big Data in the physical sciences 2 October 2015 Content Digital curation: Data and metadata Why consider the physical sciences? Astronomy: Video Physics: LHC for example. Video The Research

More information

CLARIN-NL Third Call: Closed Call

CLARIN-NL Third Call: Closed Call CLARIN-NL Third Call: Closed Call CLARIN-NL launches in its third call a Closed Call for project proposals. This called is only open for researchers who have been explicitly invited to submit a project

More information

White Paper. Version 1.2 May 2015 RAID Incorporated

White Paper. Version 1.2 May 2015 RAID Incorporated White Paper Version 1.2 May 2015 RAID Incorporated Introduction The abundance of Big Data, structured, partially-structured and unstructured massive datasets, which are too large to be processed effectively

More information

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing

More information

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel

More information

Data-Intensive Science and Scientific Data Infrastructure

Data-Intensive Science and Scientific Data Infrastructure Data-Intensive Science and Scientific Data Infrastructure Russ Rew, UCAR Unidata ICTP Advanced School on High Performance and Grid Computing 13 April 2011 Overview Data-intensive science Publishing scientific

More information

Data Intensive Research Initiative for South Africa (DIRISA)

Data Intensive Research Initiative for South Africa (DIRISA) Data Intensive Research Initiative for South Africa (DIRISA) A Reinterpreted Vision A. Vahed 25 November 2014 Outline Background Data Landscape Strategy & Objectives Activities & Outputs Organisational

More information

Data Management Plans - How to Treat Digital Sources

Data Management Plans - How to Treat Digital Sources 1 Data Management Plans - How to Treat Digital Sources The imminent future for repositories and their management Paolo Budroni Library and Archive Services, University of Vienna Tomasz Miksa Secure Business

More information

How To Understand And Understand The Science Of Astronomy

How To Understand And Understand The Science Of Astronomy Introduction to the VO Christophe.Arviset@esa.int ESAVO ESA/ESAC Madrid, Spain The way Astronomy works Telescopes (ground- and space-based, covering the full electromagnetic spectrum) Observatories Instruments

More information

The Big Picture: Information 01100 Technology Revolution, and 1010011 Science in the 21st Century 00101000

The Big Picture: Information 01100 Technology Revolution, and 1010011 Science in the 21st Century 00101000 011 The Big Picture: Information 01100 Technology Revolution, and 1010011 Science in the 21st Century 00101000 Roy & George s Excellent Adventure 1110100011 001001110110110 100101010001011101 Lecture 4

More information

Research Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer

Research Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer Research Data Alliance: Current Activities and Expected Impact SGBD Workshop, May 2014 Herman Stehouwer The Vision 2 Researchers and innovators openly share data across technologies, disciplines, and countries

More information

UNIVERSITY OF INFINITE AMBITIONS. MASTER OF SCIENCE COMPUTER SCIENCE DATA SCIENCE AND SMART SERVICES

UNIVERSITY OF INFINITE AMBITIONS. MASTER OF SCIENCE COMPUTER SCIENCE DATA SCIENCE AND SMART SERVICES UNIVERSITY OF INFINITE AMBITIONS. MASTER OF SCIENCE COMPUTER SCIENCE DATA SCIENCE AND SMART SERVICES MASTER S PROGRAMME COMPUTER SCIENCE - DATA SCIENCE AND SMART SERVICES (DS3) This is a specialization

More information

Big Data in the Digital Cultural Heritage

Big Data in the Digital Cultural Heritage Big Data in the Digital Cultural Heritage Antonella Fresa, Promoter Srl DCH-RP Technical Coordinator 1 Table of Content Digitisation of Cultural Heritage Toward an e-infrastructure for Digital Cultural

More information

Cloud and Big Data Standardisation

Cloud and Big Data Standardisation Cloud and Big Data Standardisation EuroCloud Symposium ICS Track: Standards for Big Data in the Cloud 15 October 2013, Luxembourg Yuri Demchenko System and Network Engineering Group, University of Amsterdam

More information

Digital libraries of the future and the role of libraries

Digital libraries of the future and the role of libraries Digital libraries of the future and the role of libraries Donatella Castelli ISTI-CNR, Pisa, Italy Abstract Purpose: To introduce the digital libraries of the future, their enabling technologies and their

More information

Concepts and Architecture of Grid Computing. Advanced Topics Spring 2008 Prof. Robert van Engelen

Concepts and Architecture of Grid Computing. Advanced Topics Spring 2008 Prof. Robert van Engelen Concepts and Architecture of Grid Computing Advanced Topics Spring 2008 Prof. Robert van Engelen Overview Grid users: who are they? Concept of the Grid Challenges for the Grid Evolution of Grid systems

More information

Canadian Astronomy Data Centre. Séverin Gaudet David Schade Canadian Astronomy Data Centre

Canadian Astronomy Data Centre. Séverin Gaudet David Schade Canadian Astronomy Data Centre Canadian Astronomy Data Centre Séverin Gaudet David Schade Canadian Astronomy Data Centre Data Activities in Astronomy Features of the astronomy data landscape Multi-wavelength datasets are increasingly

More information

Simulation and data analysis: Data and data access requirements for ITER Analysis and Modelling Suite

Simulation and data analysis: Data and data access requirements for ITER Analysis and Modelling Suite Simulation and data analysis: Data and data access requirements for ITER Analysis and Modelling Suite P. Strand ITERIS IM Design Team, 7-8 March 2012 1st EUDAT User Forum 1 ITER aim is to demonstrate that

More information

Building next generation consortium services. Part 3: The National Metadata Repository, Discovery Service Finna, and the New Library System

Building next generation consortium services. Part 3: The National Metadata Repository, Discovery Service Finna, and the New Library System Building next generation consortium services Part 3: The National Metadata Repository, Discovery Service Finna, and the New Library System Kristiina Hormia-Poutanen, Director of Library Network Services

More information

UCLA Graduate School of Education and Information Studies UCLA

UCLA Graduate School of Education and Information Studies UCLA UCLA Graduate School of Education and Information Studies UCLA Peer Reviewed Title: Slides for When use cases are not useful: Data practices, astronomy, and digital libraries Author: Wynholds, Laura, University

More information

Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing.

Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing. Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing. Dr Liz Lyon, UKOLN, University of Bath Introduction and Objectives UKOLN is undertaking

More information

Big Data Standardisation in Industry and Research

Big Data Standardisation in Industry and Research Big Data Standardisation in Industry and Research EuroCloud Symposium ICS Track: Standards for Big Data in the Cloud 15 October 2013, Luxembourg Yuri Demchenko System and Network Engineering Group, University

More information

On Establishing Big Data Breakwaters

On Establishing Big Data Breakwaters On Establishing Big Data Breakwaters with Analytics Dr. - Ing. Morris Riedel Head of Research Group High Productivity Data Processing, Juelich Supercomputing Centre, Germany Adjunct Associated Professor,

More information

This vision will be accomplished by targeting 3 Objectives that in time are further split is several lower level sub-objectives:

This vision will be accomplished by targeting 3 Objectives that in time are further split is several lower level sub-objectives: Title: Common solution for the (very-)large data challenge Acronym: VLDATA Call: EINFRA-1 (Focus on Topic 5) Deadline: Sep. 2nd 2014 This proposal complements: Title: e-connecting Scientists Call: EINFRA-9

More information

Doing Multidisciplinary Research in Data Science

Doing Multidisciplinary Research in Data Science Doing Multidisciplinary Research in Data Science Assoc.Prof. Abzetdin ADAMOV CeDAWI - Center for Data Analytics and Web Insights Qafqaz University aadamov@qu.edu.az http://ce.qu.edu.az/~aadamov 16 May

More information

Archival of raw and analysed radar data at EISCAT and worldwide

Archival of raw and analysed radar data at EISCAT and worldwide Archival of raw and analysed radar data at EISCAT and worldwide Carl-Fredrik Enell, EISCAT Scientific Association COOPEUS workshop and EGI-CC kickoff, 11 March 2015 C-F Enell, EISCAT Radar data archival

More information

e-science and technology infrastructure for biodiversity research

e-science and technology infrastructure for biodiversity research e-science and technology infrastructure for biodiversity research Wouter Los Coordinator of the Preparatory Project University of Amsterdam (institute of Biodiversity and Ecosystem Dynamics) Outline Users

More information

Big Data Hope or Hype?

Big Data Hope or Hype? Big Data Hope or Hype? David J. Hand Imperial College, London and Winton Capital Management Big data science, September 2013 1 Google trends on big data Google search 1 Sept 2013: 1.6 billion hits on big

More information

Policy Policy--driven Distributed driven Distributed Data Management (irods) Richard M arciano Marciano marciano@un marciano @un.

Policy Policy--driven Distributed driven Distributed Data Management (irods) Richard M arciano Marciano marciano@un marciano @un. Policy-driven Distributed Data Management (irods) Richard Marciano marciano@unc.edu Professor @ SILS / Chief Scientist for Persistent Archives and Digital Preservation @ RENCI Director of the Sustainable

More information

SURFsara Data Services

SUPPORTING DATA-INTENSIVE SCIENCES Mark van de Sanden The world of the many Many different users (well organised (international) user communities, research groups, universities,

The Challenge of Handling Large Data Sets within your Measurement System

The Often Overlooked Big Data Aaron Edgcumbe Marketing Engineer Northern Europe, Automated Test National Instruments Introduction

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI

Big Data for BioMedical Sciences EMBL-EBI: What we do and why? Challenges & Opportunities Infrastructure Requirements

Exploitation of ISS scientific data

Cooperative ISS Research data Conservation and Exploitation Luigi Carotenuto Telespazio s.p.a. Copernicus Big Data Workshop March 13-14 2014 European Commission Brussels

Integrating Research Information: Requirements of Science Research

Brian Matthews Scientific Information Group E-Science Centre STFC Rutherford Appleton Laboratory brian.matthews@stfc.ac.uk The science

CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait

Agenda Introduction What is Big Data Why Big Data? Characteristics of Big Data Applications of Big Data Problems

From Distributed Computing to Distributed Artificial Intelligence

Dr. Christos Filippidis, NCSR Demokritos Dr. George Giannakopoulos, NCSR Demokritos Big Data and the Fourth Paradigm The two dominant paradigms

European Plate Observing System: a Long-term Integration Plan for Solid Earth Sciences

Massimo Cocco & EPOS PP Team AAAS 2011 Meeting Washington DC February 2011 INGV What is EPOS? EPOS is a long-term

The PhysiomeSpace data sharing service and the new VPH-Share end-user interface

Matteo Balasso SCS srl, Italy Outline Introduction Past experience: PhysiomeSpace user interface Knowledge sharing Sharing data, models

DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure. Arcot (RAJA) Rajasekar DICE/SDSC/UCSD

What is SRB? First Generation Data Grid middleware developed at the San Diego Supercomputer Center

Data Wrangling: From the Wild to the Lake

Ignacio Terrizzano Peter Schwarz Mary Roth John Colino IBM Research - Almaden 48 hours of video is uploaded to YouTube every minute Walmart processes million transactions

Digital Communication and Interoperability - A Case Study

CLARIN: a pan-european research infrastructure for language resources Martin Wynne Martin.wynne@it.ox.ac.uk Oxford e-research Centre & IT Services (formerly OUCS) & Faculty of Linguistics, Philology and

Computing Strategic Review. December 2015

Front cover bottom right image shows a view of dark matter at redshift zero in the Eagle simulation, by the Virgo Consortium using the DiRAC Facility (Durham Data

Image Data, RDA and Practical Policies

Rainer Stotzka and many others KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association www.kit.edu Data Life Cycle Lab

The Knowledge Sharing Infrastructure KSI. Steven Krauwer

Why a KSI? Building or using a complex installation requires specialized skills and expertise. CLARIN is no exception. CLARIN is populated with

CMIP6 Data Management at DKRZ

icas2015 Annecy, France on 13-17 September 2015 Michael Lautenschlager Deutsches Klimarechenzentrum (DKRZ) With contributions from ESGF Executive Committee and WGCM Infrastructure

Extending SharePoint for Real-time Collaboration: Five Business Use Cases and Enhancement Opportunities

Published: December 2012 Evolving SharePoint for Real-time Collaboration: Contents Section Executive

Data Centric Systems (DCS)

Architecture and Solutions for High Performance Computing, Big Data and High Performance Analytics High Performance Computing with Data Centric Systems 1 Data Centric Systems

RDA Report Working Meeting Session 5 IG Federated Identity Management. Presentations

Notes by F VandenBoom Presentations The AARC project, report by Licia Florio https://aarcproject.eu by improving the interoperability

Big Data Europe

BIG DATA EUROPE SC1 Hangout Big Data Challenge in Health www.big-data-europe.eu Empowering Communities with Data Technologies Agenda for Today Welcome! Brief intro and background (OPF) Introduction to the

DRIVER Providing value-added services on top of Open Access institutional repositories

Dr Dale Peters Scientific Technical Manager : DRIVER SUB Goettingen Germany Gaining the momentum: Open Access and

e-irg Blue Paper on Data Management

FINAL VERSION 30 October 2012 Table of contents 1. Executive summary 2. Introduction - scope of the document 3. Grand challenges and requirements

NASA's Big Data Challenges in Climate Science

Tsengdar Lee, Ph.D. High-end Computing Program Manager NASA Headquarters Presented at IEEE Big Data 2014 Workshop October 29, 2014 7-km GEOS-5 Nature Run

Building an Ecosystem to Accelerate Data-Driven Innovation

Dr. Francine Berman Chair, Research Data Alliance / US Edward P. Hamilton Distinguished Professor of Computer Science, Rensselaer Polytechnic

CHPC initiatives in Data and HPC in South Africa. An initiative of the Department of Science and Technology Managed by the CSIR Meraka Institute

The South

An analysis of Big Data ecosystem from an HCI perspective.

Jay Sanghvi Rensselaer Polytechnic Institute For: Theory and Research in Technical Communication and HCI Rensselaer Polytechnic Institute Wednesday,

A public-private partnership building a multidisciplinary cloud platform for data intensive science

Bob Jones Head of openlab IT dept CERN 3 September 2013 This document produced by Members of the Helix

Data Management Considerations for the Data Life Cycle

NRC STS Panel 2011 November 17, 2011, Washington DC Peter Fox (RPI) foxp@rpi.edu, pfox@cs.rpi.edu Tetherless World Constellation http://tw.rpi.edu