D4.2 - Analysis of Data Infrastructure and Data repositories


Project acronym     : CHAIN-REDS
Project full title  : Co-ordination & Harmonisation of Advanced e-infrastructures for Research and Education Data Sharing
Grant agreement     :
Start date          : December 1, 2012
Duration            : 30 months
Programme           : 7th Framework Programme (FP7)
Theme               : Capacities specific program
Thematic area       : Research Infrastructures
Funding scheme      : Support action
Call identifier     : FP7 INFRASTRUCTURES
Project coordinator : Federico Ruggieri (INFN)

D4.2 - Analysis of Data Infrastructure and Data repositories

Deliverable Status  : Draft or Final
File Name           : CHAIN-REDS-D4.2_V05
Due Date            : November 2013 (M12)
Submission Date     : November 2013 (M12)
Dissemination Level : Public
Author              : CIEMAT (rafael.mayo@ciemat.es)

Copyright The CHAIN-REDS Consortium

INFN      : Istituto Nazionale di Fisica Nucleare - Italy
CIEMAT    : Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas - Spain
GRNET     : Greek Research and Technology Network S.A. - Greece
CESNET    : Zajmove Sdruzeni Pravnickych Osob - Czech Republic
UBUNTUNET : The UbuntuNet Alliance for Research and Education Networking - Malawi
CLARA     : Cooperación Latinoamericana de Redes Avanzadas - Uruguay
IHEP      : Institute of High Energy Physics Chinese Academy of Sciences - China
ASREN     : Arab States Research and Education Network - Jordan
SIGMA     : Sigma Orionis - France
C-DAC     : Centre for Development of Advanced Computing - India

Contact: proj-office@chain-project.eu

CHAIN-REDS Project - Deliverable D4.2

Disclaimer

More details on the copyright holders can be found at

CHAIN-REDS ("Co-ordination & Harmonisation of Advanced e-infrastructures for Research and Education Data Sharing") is a project co-funded by the European Union in the framework of the 7th FP for Research and Technological Development, as part of the Capacities specific program - Research Infrastructures FP7 INFRASTRUCTURES. For more information on the project, its partners and contributors visit www.chain-project.eu.

You are permitted to copy and distribute verbatim copies of this document containing this copyright notice, but modifying this document is not allowed. You are permitted to copy this document in whole or in part into other documents if you attach the following reference to the copied elements: "Copyright (C) CHAIN-REDS Consortium".

The information contained in this document represents the views of the CHAIN-REDS Consortium as of the date they are published. The CHAIN-REDS Consortium does not guarantee that any information contained herein is error-free, or up to date. THE CHAIN CONSORTIUM MAKES NO WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, BY PUBLISHING THIS DOCUMENT.

Revision Control

Issue  Date        Comment                   Author
v01    06/11/2013  First version             Rafael Mayo-García
v02    08/11/2013  New info added            Margaret Ngwira, Bruce Becker and Rafael Mayo-García
v03    20/11/2013  Comments and suggestions  Ognjen Prnjat, Rafael Mayo-García
v04    29/11/2013  New information added     Roberto Barbera, Rafael Mayo-García
v05    03/12/2013  Comments and edition      Ognjen Prnjat, Roberto Barbera, Rafael Mayo-García

Abstract

This document reports on the analysis of Data Infrastructure and Data Repositories carried out by WP4 Data Infrastructure during the first year of the CHAIN-REDS project. Bearing in mind the increasing importance of data management and data analytics in big data issues, it describes the current status of the project objectives as well as of the collaborating Virtual Research Communities of CHAIN-REDS: EIFL, agINFRA, ENGAGE and EarthServer. Information about other initiatives is also presented, in particular the main European data initiative, EUDAT.

Based on this, the document analyses how far these communities have adopted the standards promoted by CHAIN-REDS and what transcontinental coverage the current computational infrastructures offer to final users. This information is complemented with the actions that the project has carried out to enhance this transcontinental impact.

The document then describes the CHAIN-REDS tools that are being implemented to propose a methodology for computing and data trust building, i.e. the CHAIN-REDS Knowledge Base and the Semantic Search Engine. The first results of their use are detailed, as well as the plans and the road map that the project is going to follow to extend the aforementioned data trust building.

Table of contents

ABSTRACT
TABLE OF CONTENTS
PURPOSE
GLOSSARY
1. INTRODUCTION
2. UPDATED INFORMATION OF TRANSCONTINENTAL DATA INFRASTRUCTURE AND DATA REPOSITORIES
   - WP4 objectives
   - CHAIN-REDS Collection of Data
   - General data related initiatives: EUDAT, EIFL and documents repositories
   - Agriculture: agINFRA
   - e-government: ENGAGE and H3Africa
   - Earth Sciences: EarthServer and SAEON
   - Cultural Heritage: DCH-RP and the University of Cape Town
   - Astrophysics: IVOA and SKA
   - e-infrastructure Data: iMENTORS
   - WP4 Dissemination actions
3. ANALYSIS ON DATA INFRASTRUCTURES UNDER THE CHAIN-REDS PERSPECTIVE
   - Standards
   - Transcontinental coverage
4. THE CHAIN-REDS TOOLS
   - The CHAIN-REDS Knowledge Base
   - The CHAIN-REDS Semantic Search Engine
5. THE CHAIN-REDS WORK PLAN ON DATA INFRASTRUCTURES
   - The worldwide interoperability demo
   - Adding data functionalities to the CHAIN-REDS demo
CONCLUSIONS

Purpose

CHAIN-REDS is an FP7 project co-funded by the European Commission (DG CONNECT) which started on 1 December 2012. It aims at promoting and supporting technological and scientific collaboration across different e-infrastructures established and operated in various continents, in order to define a path towards a global e-infrastructure ecosystem that will allow Virtual Research Communities (VRCs), research groups and even single researchers to access and efficiently use worldwide distributed resources (i.e., computing, storage, data, services, tools, applications).

The purpose of this deliverable is to provide a study of the commonalities, differences, requirements and future challenges of the identified Data Infrastructures and Data Repositories. This is a first version delivered at the end of the first year, which includes a first updated edition of D4.1 "Trans-continental Data Infrastructures and Data repositories". Future updated versions of D4.2 "Analysis of Data Infrastructures and Data repositories" will be part of D4.3 and D4.4, due in months M18 and M24 respectively.

In addition to the updated information on Data Infrastructures and their characteristics, this document describes the CHAIN-REDS tools that manage and exploit data, and the roadmap of the project for demonstrating data trust building.

Glossary

API         Application Programming Interface
CDMI        Cloud Data Management Interface
CHAIN       Co-ordination and Harmonisation of Advanced e-infrastructures
CHAIN-REDS  Co-ordination and Harmonisation of Advanced e-infrastructures for Research and Education Data Sharing
CNRI        Corporation for National Research Initiatives
DCI         Distributed Computing Infrastructure
DCMI        Dublin Core Metadata Initiative
DoW         Description of Work (Annex I to the GA)
DR          Data Repository
EC          European Commission
EGI         European Grid Initiative
FOAF        Friend Of A Friend machine readable ontology
FP7         European Commission's Framework Programme Seven
GA          Grant Agreement
ICT         Information and Communication Technology(ies)
IVOA        International Virtual Observatory Alliance
KB          Knowledge Base
MoU         Memorandum of Understanding
OADR        Open Access Document Repository
OAI-PMH     Open Archives Initiative Protocol for Metadata Harvesting
OCCI        Open Cloud Computing Interface
OWL         Web Ontology Language
PID         Persistent IDentifier
RDF         Resource Description Framework
ROC         Regional Operation Centre
SKA         Square Kilometre Array
SPARQL      SPARQL Protocol and RDF Query Language
VRC         Virtual Research Community
VRE         Virtual Research Environment
WP          Work Package
XML         Extensible Markup Language

1. Introduction

CHAIN-REDS started on 1 December 2012 and aims at promoting and supporting technological and scientific collaboration across different e-infrastructures established and operated in various continents. It is an FP7 project co-funded by the European Commission (DG CONNECT) and its ultimate goal is to define a path towards a global e-infrastructure ecosystem that will allow Virtual Research Communities (VRCs), research groups and even single researchers to access and efficiently use worldwide distributed resources (i.e., computing, storage, data, services, tools, applications).

To do so, the project is structured in several Work Packages that address these different scenarios. Specifically, WP4 "Data Infrastructure" deals with the promotion of trust building towards open scientific data infrastructures across the world regions, including organisational, operational and technical aspects, with a strong liaison with WP3 "Interoperation and coordination of e-infrastructures" and WP5 "Support to small groups and emerging communities".

Whatever one may think of the ubiquity of the "Big Data" meme, it is clear that the growing size and diversity of datasets are changing the way we approach the world around us. This is true in fields from industry to government to media to academia and virtually everywhere in between. Our increasing abilities to gather, process, visualize, and learn from large datasets are helping to push the boundaries of our knowledge.

Nowadays, open access to data is becoming a must and is being promoted by many entities. In October 2010 the European Commission published the report entitled "Riding the wave. How Europe can gain from the rising tide of scientific data", and it has recently issued the IP/12/790 Communication [1], which considers open access a fundamental requirement for improving the flow of knowledge and, jointly with it, innovation in Europe.
Thus, open access shall be required for all scientific publications produced with funding from Horizon 2020, the next EU programme for funding research and innovation. The Communication recommended that the Member States adopt in their national programmes an approach similar to that of the Commission. In addition, a European Commission study, which covered the EU and neighbouring countries as well as Brazil, Canada, Japan and the United States of America, states that over 40% of the peer-reviewed scientific articles published worldwide between 2004 and 2011 are now available online in open access.

Nevertheless, this action is mainly focused on document repositories, and it must be widened to actual data that can be exploited by as many researchers and users as possible. Furthermore, such use and management of data should be done in accordance with the advances that are being made in e-infrastructures, which is the topic of CHAIN-REDS WP3.

In order to study the opportunities of data sharing across different e-infrastructures and continents, two main actions have been carried out so far: collecting information from Data Repositories worldwide and widening the scope of the previous CHAIN Knowledge Base (KB) [2] to Data Infrastructures.

At the end of the first year of CHAIN-REDS, when this report is being delivered, information about Data Repositories (DR) and Open Access Document Repositories (OADR) is extensively provided to users through the CHAIN-REDS KB, and the project manages information about the commonalities, differences, requirements and future challenges of the identified data communities. All of these are detailed in Sections 2 and 3 of this document.

Furthermore, the CHAIN-REDS Consortium felt it necessary to extend the KB capabilities to deal with data methodologies: semantic search; semantic-web-based metadata enrichment; download and upload of data; etc. Documented information about these implementations is provided in Section 4.

Finally, since the main objective of WP4 is to provide proof-of-principle use-cases for data sharing across continents, the road map described in D4.1 "Transcontinental Data Infrastructures and Data repositories" has been redefined in order to reach this ultimate goal. It is worth mentioning that this redefinition has actually implied an extension of activities, whose results are summarised in Section 5.

[1] IP/12/790 Communication, available at
[2] CHAIN Knowledge Base (KB), available at

2. Updated Information of Transcontinental Data Infrastructure and Data Repositories

Huge quantities of observational data and simulations are becoming available to researchers at an ever-accelerating rate. They are transforming science and having an impact on the Scientific Method, which has been for four centuries the iterative procedure used by scientists and researchers to go through the so-called knowledge path. For example, new astronomical observatories anticipate delivering combined data volumes of over 100 PB by 2020, yet even the current data volume of 1 PB is beginning to strain archives [3]. At the same time, simulations are increasing in complexity and scope. Thus, there is a clear growth in volume, but also in the complexity of products, which are often derived by integrating existing data sets and confronting them with simulations, by teams of distributed and often international collaborations.

The sciences will then witness a breakdown of their current computing model if no intervention is made. At present, data are discovered and downloaded through web-based services offered by archives and data centres, and then analysed and integrated on local machines; the very scale of new data sets will transform data discovery, access, and computation in many disciplines. Given that maximum science return will involve federation of data sets, discovery of data or simulations will be performed through queries to distributed archives. These queries will aim to locate data having particular properties and even to store these data, so data will be processed in situ by archives running users' software or shipped to remote processing facilities.

In other words, as stated by the U.S. National Science Foundation (NSF), the inclusion of a data management plan is required, whose provisions include sharing the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Proposals in response to other funding bodies, such as the European Commission Directive previously mentioned, have similar requirements.

As a consequence, it is necessary first to know which data are being stored, where and how, and second to implement a mechanism that allows researchers to access and further exploit such data for producing new science. At the same time, the new data and even the publications that are produced should later be stored under the same conditions as the input data used.

As described in D4.1 [4], the first step taken by CHAIN-REDS was to identify best practices and to approach the main stakeholders of regional e-infrastructures and of data provisioning/use, in order to propose and even define a path towards a global e-infrastructure ecosystem that will allow VRCs, research groups and even single researchers to access and efficiently use worldwide distributed resources, i.e. computing, storage, data, services, tools, applications. A survey about DRs and OADRs for feeding the CHAIN-REDS KB was performed, and commonalities in the strategies followed by some VRCs were identified. As a result, some communities were approached and official collaborations were even agreed. What follows is updated information on the status of these collaborations.

[3] G. Bruce Berriman, The role in the Virtual Astronomical Observatory in the era of massive data sets
[4]

2.1 WP4 objectives

For the sake of completeness, this subsection summarises the most important information about the main WP4 objectives and the actions taken to achieve them until M12. The five objectives identified in the DoW and their current status are summarised in Table I.

Table I. CHAIN-REDS objectives and their current status (as of Nov 2013)

Objective: Extend the CHAIN-REDS KB with Data Infrastructures
Current status: The CHAIN-REDS KB already contains a large number of links to OADRs and DRs (almost 3,000 in total). New data-related capabilities have also been added to the Semantic Search, Applications and Science Gateway links.

Objective: Support the study of data infrastructures for a few VRCs
Current status: CHAIN-REDS has established official MoUs with several VRCs and some conversations are underway with others.

Objective: Promote trust building towards open scientific data infrastructures across the world regions
Current status: WP4 jointly with WP3 has set up a road map to demonstrate such trust building, which includes (data) infrastructure worldwide.

Objective: Study the opportunities of data sharing across different e-infrastructures and continents
Current status: The project, jointly with the identified VRCs, is looking for datasets already stored worldwide which enhance the impact of the aforementioned demo.

Objective: Provide proof-of-principle use-cases for data sharing across the continents
Current status: The demo mentioned in the third objective is expected to be shown during the last months of the project lifetime.

In D4.1 the reader can find a list of actions (Action1-Action5) that were established as the first steps towards accomplishing the objectives appearing in Table I. Almost all of them have been completed; only a deeper analysis of the DRs and OADRs of interest to CHAIN-REDS and to the collaborating VRCs with a worldwide impact is still under way. To perform this analysis, specific questions have been put to the VRCs (see Section 3).

2.2 CHAIN-REDS Collection of Data

The CHAIN project, the precursor of CHAIN-REDS, already promoted interoperability as one of its main objectives. A worldwide multi-middleware interoperability demonstration [5] was given in September 2012 at the EGI Technical Forum, and the project carried out several actions that were a step further in facilitating access to information coming from different regions. Thus, the KB provided dynamically updated information about the deployment of e-infrastructure related topics per country and even about specific Distributed Computing Infrastructures (DCIs) by means of a Site or a Table view. All these concepts led to a validation model for VRCs that was successfully tested by the end of CHAIN, relying on the aforementioned worldwide demo and on the road map of services requested by the VRCs from the DCIs.

For CHAIN-REDS, new data-oriented capabilities have been implemented (see Section 4). The first of them was to include in the KB information about DRs and OADRs. This compilation was obtained in three ways:

- By a multi-layer structure where a metadata harvester, running either on Grid or Cloud, fetches metadata from OAI-PMH end-points of many OADRs and DRs;
- By direct integration of repositories that CHAIN-REDS was aware of by means of the survey described in D4.1; and,
- By direct contacts between the project and other initiatives mainly devoted to data storage.

Figure 1. A snapshot of the CHAIN-REDS Knowledge Base - OADR Site view

Most of the available links to open access repositories were obtained by the first method. The data included from the results of the CHAIN-REDS survey are explained in detail in D4.1, as are the specific agreements with data initiatives such as DRIVER [6], OpenAIRE [7], OpenDOAR [8], Databib [9] or DataCite [10]. As of 24 October 2013, the KB contains links to 2,579 entries from OADRs and 596 from DRs, i.e. 91 OADRs and 89 DRs have been incorporated since deliverable D4.1.

In order to visualise and access the repositories, users can employ both geographic and table views. In the former, red markers refer to data currently taken from the almost 2,500 OADRs of DRIVER, OpenAIRE, and OpenDOAR, which currently refer to more than 30 million documents. Yellow markers refer to other OADRs, i.e. those integrated by means of the outreach activities of the project (the second and third methods above), such as those belonging to La Referencia [11] in Latin America and those pointed out by EIFL in Europe and Africa (see next sub-section). For each OADR/DR, the following information is provided: the country where the data is stored; the name of the repository (with a direct link to its home page); the scientific domain it belongs to; and the organisation maintaining it.

2.3 General data related initiatives: EUDAT, EIFL and documents repositories

As mentioned above, the first kind of repositories that CHAIN-REDS has been working with are document repositories, that is, articles, papers and proceedings, which have been included in the project KB classified as either OADRs or DRs. This action has been of utmost importance because it has allowed the project to develop semantic methodologies for improving the retrieval of data and related information and, also, to plan a major challenge: the extraction of raw data from articles (see Sections 4 and 5).

In addition, CHAIN-REDS has identified some general data related initiatives that represent major actors in the European landscape. Among them, the main candidate is EUDAT [12]. This initiative was created to work on data management and aims at providing European researchers from all fields with state-of-the-art instruments and services that support the deployment of new research facilities at a pan-European level.

Since the first contacts, further interaction between EUDAT and CHAIN-REDS has been established: the commonalities of both projects have been detailed, and the CHAIN-REDS KB has been proposed as an example of data management and standards adoption by means of the OADR and DR views. In addition, meetings have been held between representatives of both projects, where specific actions have been scheduled. Thus, the EUDAT technical coordinator, Prof. Peter Wittenburg, has participated in the CHAIN-REDS workshop [13] held as part of the IEEE e-Science Conference (Beijing, 22 Oct 2013), and the CHAIN-REDS Project Coordinator, Technical Coordinator, WP4 Manager and T3.1 Leader have participated in the 2nd EUDAT conference [14] (Rome, Oct 2013). From this point on, a MoU between the two initiatives has been internally discussed and agreed within the EUDAT consortium, and further collaboration combining the developments of both initiatives (KB and Persistent Identifiers in the case of CHAIN-REDS) is expected.

Additional joint work has been set up between EIFL [15] and CHAIN-REDS. Working in collaboration with libraries in more than 60 developing and transition countries in Africa, Asia, Europe, and Latin America, EIFL enables access to knowledge for education, learning, research and sustainable community development. In accordance with the road map proposed by CHAIN-REDS described in Section 5, and because of the regions of interest common to both initiatives, common actions are expected. In this sense, CHAIN-REDS will propose its current technical developments as a proof-of-principle methodology for making better use of data. In the last six months, a MoU [16] has been signed between EIFL and CHAIN-REDS, and EIFL has provided the OAI-PMH end-points of several OADRs, which have been added to the KB.

2.4 Agriculture: agINFRA

One of the most promising fields in data and metadata management is agricultural science. A metadata framework has even been introduced by the Food and Agriculture Organization of the United Nations (FAO), which recognized that there was a strong need for statistical metadata that would provide a better understanding of all the data items, and of the way to obtain them, within the national systems of agricultural statistics. The idea in the agricultural field is to establish metadata databases for food and agriculture as key components for improving data quality and statistical development. The concept of metadata here describes all aspects of the national systems of agricultural statistics: how, when, where, why, and by whom the data are collected. The challenge faced by the management of metadata at the international level is how to design a framework that countries can use to collect relevant and succinct information in a manageable and comparable way.

In order to accomplish such a task in an affordable way, CHAIN-REDS has established a collaboration with the FP7 agINFRA [17] project, in which an international institution (FAO) and partners from Europe, Asia and Latin America participate. agINFRA aims to set up a data infrastructure to support agricultural scientific communities, promoting data sharing and the development of trust in agricultural sciences. It also plans to improve service deployment for data by transferring scientific and technological results from the agricultural field into real outcomes.
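
As a brief aside on the OAI-PMH end-points mentioned in Sections 2.2 and 2.3, the following Python sketch illustrates what one harvesting step might look like. It is not the CHAIN-REDS harvester, whose implementation is not described in this deliverable: the function names, the sample response and the `example.org` end-point are hypothetical, and only the protocol elements (the `ListRecords` verb, the `oai_dc` metadata prefix and the `resumptionToken` used for paging) come from the OAI-PMH specification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, resumption_token=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an OAI-PMH end-point."""
    if resumption_token:
        # When paging, the token replaces all other arguments except the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) extracted from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        identifier = rec.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [t.text for t in rec.iter(f"{{{DC_NS}}}title")]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(f".//{{{OAI_NS}}}resumptionToken")
    return records, (token or None)


# Hypothetical, hand-written response used only to exercise the parser.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, next_token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("http://example.org/oai", next_token)
```

A real harvester would fetch each URL over HTTP and repeat the request with the returned token until no token is left; the sketch leaves the network step out so that it can be run offline.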
In addition, although agriculture is a resource in almost every country in the world, from the e-infrastructure point of view it is also a key point that agINFRA relies for its computing power and administration on the same infrastructures targeted by CHAIN-REDS in WP3, and that it has also adopted the Science Gateway paradigm.

In the last six months, a MoU has been signed between agINFRA and CHAIN-REDS, which contributes to fulfilling the WP4 milestone MS6 "MoUs signed with at least 2 VRCs". The first common action has been an exchange of information about the use by agINFRA of the standards adopted by CHAIN-REDS. This is important in order to propose to agINFRA and the agriculture community the CHAIN-REDS road map of data trust building, which is initially being tested with generic applications.

Furthermore, in the context of the MoU between the two projects, the Semantic Search part of the agINFRA website [18] has been restructured according to the new capabilities developed within the CHAIN-REDS consortium. It is now possible to perform a semantic search in parallel, i.e. a new functionality that allows users to search in parallel across the millions of resources contained in the CHAIN-REDS Knowledge Base and in the FAO OpenAgris repository.

[16] CHAIN-REDS MoUs available at
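
The idea of searching two repositories in parallel can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two backend functions and their tiny in-memory corpora are hypothetical stand-ins, not the actual interfaces of the CHAIN-REDS Knowledge Base or of the partner repository, and the real service is semantic rather than a plain substring match.

```python
from concurrent.futures import ThreadPoolExecutor


def search_kb(query):
    """Hypothetical stand-in for a search over the Knowledge Base."""
    corpus = ["maize yield dataset", "grid computing survey"]
    return [doc for doc in corpus if query.lower() in doc.lower()]


def search_partner(query):
    """Hypothetical stand-in for a search over a partner repository."""
    corpus = ["Maize pest control report", "soil moisture records"]
    return [doc for doc in corpus if query.lower() in doc.lower()]


def parallel_search(query, backends):
    """Send the same query to every backend concurrently and group hits by backend."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in backends.items()}
        # result() blocks until the corresponding backend has answered.
        return {name: future.result() for name, future in futures.items()}


results = parallel_search("maize", {"kb": search_kb, "partner": search_partner})
```

Keeping the results grouped per backend, rather than merging them into one ranked list, avoids having to compare relevance scores that the two sources may compute differently.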

2.5 e-government: ENGAGE and H3Africa

In e-government, public agencies are responsible for providing access to information and services for everyone living within a country or a region, all of whom will have varying levels of IT skills, including individuals with lower incomes and disabilities. This is why a fundamental activity in e-government is organizing government collections on the internet in a way that helps users to search and locate government information without needing details of the government structure, or to find government services without knowing which agency delivers them. Thus, metadata is a valuable tool in e-government applications for enabling a seamless flow of information and services across government and for helping citizens to find government information and services more easily.

CHAIN-REDS has identified the FP7 ENGAGE project as an ideal initiative to collaborate with. ENGAGE [20] is an infrastructure for open, linked governmental data provision both for research communities and citizens. The ENGAGE e-infrastructure is envisaged to promote a highly synergetic approach to governance research by providing the ground for experimentation to actors from both ICT and non-ICT related disciplines and scientific communities, as well as by ensuring that the scientific outcomes are made accessible to citizens, so that they can monitor public service delivery and influence the decision-making process. ENGAGE will provide enhanced services in the data e-infrastructure layer while at the same time building a community that can exploit the e-infrastructure services.

The project has developed a platform (currently in a beta version), which has already been a good starting point for collaboration between this initiative and both the CHAIN-REDS consortium and the groups interested in e-government that were approached worldwide by means of the CHAIN-REDS WP4 survey.
Thus, a first common action has been a deep analysis of the ENGAGE platform by the CHAIN-REDS management. The ENGAGE coordinators contacted CHAIN-REDS, and feedback about the platform functionality was submitted to them. Within its field of expertise, the ENGAGE platform provides access to 14,379 datasets (2 November 2013) already searchable by SPARQL queries. Later on, there will be an exchange of information about the use of the standards adopted by CHAIN-REDS and ENGAGE. This is important in order to propose to ENGAGE and the e-government community the CHAIN-REDS road map of data trust building, which is initially being tested with generic applications.

In the last six months, a MoU has been signed between ENGAGE and CHAIN-REDS, which contributes to fulfilling the WP4 milestone MS6 "MoUs signed with at least 2 VRCs". In the context of that MoU, the Semantic Search part of the CHAIN-REDS website has been restructured. It is now possible to perform a semantic search in two ways:

- Single, i.e. the usual semantic search service that is described below in this document [21]; and,
- Parallel, a new functionality that allows users to search in parallel across the millions of resources contained in the CHAIN-REDS Knowledge Base and in the ENGAGE Platform.
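
To give an idea of what searching datasets by SPARQL queries involves, the sketch below builds a simple query and the corresponding request URL. It is a minimal illustration, not the ENGAGE interface: the endpoint address is a placeholder, and the use of the `dcterms:title` property is an assumption about how dataset titles might be modelled. Only the query syntax and the `query` parameter of a GET request follow the SPARQL 1.1 query language and protocol.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def dataset_query(keyword, limit=10):
    """Build a SPARQL SELECT query for datasets whose title contains a keyword."""
    # Escape backslashes and double quotes so the keyword stays a valid literal.
    safe = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "SELECT ?dataset ?title WHERE {\n"
        "  ?dataset dcterms:title ?title .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?title)), LCASE("{safe}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def sparql_get_url(endpoint, query):
    """Encode a query as a SPARQL protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder endpoint, used only to show the shape of the request.
query = dataset_query("budget")
url = sparql_get_url("https://example.org/sparql", query)
decoded = parse_qs(urlparse(url).query)["query"][0]
```

Decoding the URL again, as in the last line, is a cheap way to check that the query survives the encoding unchanged before it is sent to a real endpoint.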

CHAIN-REDS has also approached H3Africa [23]. The Human Heredity and Health in Africa initiative aims to facilitate a contemporary research approach to the study of genomics and environmental determinants of common diseases, with the goal of improving the health of African populations. To accomplish this, the H3Africa Initiative aims to contribute to the development of the necessary expertise among African scientists, and to establish networks of African investigators. Furthermore, data generated from this effort will inform strategies to address health inequity; the final goal is to develop a scientific case for a pilot study of one or more specific diseases and to produce some general principles/guidelines for collaboration, data sharing and addressing Ethical, Legal and Social Issues.

2.6 Earth Sciences: EarthServer and SAEON

With increasing volumes of satellite, remote sensing, model and other Earth Science data available, and with the popularity of the Internet, Earth scientists now face the challenge of publishing and finding interesting data sets effectively and efficiently. One of the main barriers to exploiting the great wealth of global Earth science data available today is that researchers are unable to rapidly search and find data relevant to their studies.

To explore the CHAIN-REDS objectives in this field, the project is collaborating with the FP7 initiative EarthServer [24], which is working on establishing open access and ad-hoc analytics on several extreme-size Earth Science data sets related to cryospheric, airborne, atmospheric and planetary sciences and also to geology and oceanography. EarthServer relies on a Science Gateway based on the one developed by the CHAIN-REDS project, so a close collaboration has been set up between these two projects in order to build trust.
To achieve such a goal, it is also worth mentioning that both projects share the same e-infrastructures and that the data collected in EarthServer are of interest in every region of the world.

In the last six months, a MoU has been signed between EarthServer and CHAIN-REDS, which contributes to fulfilling the WP4 milestone MS6 "MoUs signed with at least 2 VRCs". A deep exchange of information about the use by EarthServer of the standards adopted by CHAIN-REDS has already taken place. This is important in order to propose to EarthServer and the Earth Science community the CHAIN-REDS road map of data trust building, which is initially being tested with generic applications.

Besides, the two projects have held a discussion about data analytics. Two tools were considered, SciDB [25] and rasdaman [26]; the latter has been initially selected, and further common developments are expected in the near future in accordance with the goals to be achieved by the CHAIN-REDS road map.

CHAIN-REDS has also established contacts with the SAEON consortium [27]. Its vision is a sustained, coordinated, responsive and comprehensive in situ South African Earth Observation Network that delivers long-term reliable data for scientific research and informs decision-making for a knowledge society and improved quality of life. Progress in sustainable development is constrained by the lack of reliable long-term data at scales that are relevant to policy, and by the lack of integration between the various systems that provide information on the environmental, social and economic elements of sustainability. To address this critical gap, SAEON will collect, store and assess appropriate longitudinal social, economic and environmental data to inform relevant research, policy, reporting and action.

2.7 Cultural Heritage - DCH-RP and the University of Cape Town

The use of computational methods in the humanities is growing rapidly, driven by the increasing quantities of born-digital primary sources (such as e-mails and social media) and by the large-scale digitisation programmes applied to libraries, museums and archives. This has resulted in a range of interesting applications and case studies, which at the same time highlight the interpretative issues raised by applying such hard methods to subjective questions in the humanities. Moreover, the questions and concerns raised by the humanities themselves have consequences for the interpretation of big data in general and for the challenge of producing quality (meaning, knowledge and value) from quantity. Several topics are therefore of interest: text and data mining of historical and archival material; social media analysis; crowd-sourcing; archival practices; big data in Heritage; metadata schemas; etc. As part of the Dublin Core Metadata Initiative (DCMI), a Cultural Heritage Metadata Task Group is currently active.
The Digital Cultural Heritage Roadmap for Preservation [28] (DCH-RP) is a coordination action supported by the European Commission under the e-infrastructure Capacities Programme of the Seventh Framework Programme for Research (FP7). The project was launched in October 2012 to look at best practices for the preservation standards in use. DCH-RP aims to: (i) harmonise data storage and preservation policies in the digital cultural heritage sector at European and international level, dealing with the storage phase, which includes both long-term and short-term preservation; (ii) advance the dialogue among DCH-RP institutions, e-infrastructures, research and private organisations and integrate these efforts into a common work; and (iii) identify the most suitable models for the governance, maintenance and sustainability of such an infrastructure. The main outcome of DCH-RP will be a roadmap for the implementation of a federated preservation e-infrastructure, supplemented by practical tools for decision makers. Since the roadmap will be validated through a range of proofs of concept, a close collaboration with CHAIN-REDS is expected, as both initiatives share the same vision and technological approach. In the last six months, a MoU has been signed between DCH-RP and CHAIN-REDS, which contributes to the fulfilment of the WP4 milestone MS6 (MoUs signed with at least 2 VRCs), and further joint actions are on the way.

At the beginning of October 2013, the CHAIN-REDS Knowledge Base and the Semantic Search Engine were showcased at the eResearch Africa 2013 Conference [29] held in Cape Town (South Africa). This has triggered a collaboration with the Metadata Working Group initiative and the Information and Communication Technology Services at the University of Cape Town (UCT) to tailor and adapt the CHAIN-REDS services to the needs of UCT and to the South African strategy on open access document and data repositories.

2.8 Astrophysics - IVOA and SKA

In this community, neither the international collaborations supporting big facilities nor the bureaux or societies dictate how a data centre handles its own archive. However, a Virtual Observatory layer is needed to translate any locally stored data into an agreed standard. Data providers are therefore advised to systematically collect metadata about the curation process, assign unique identifiers, describe the general content of a collection, and provide the interface and capability parameters of their services. In the context of Astrophysics, CHAIN-REDS has identified the International Virtual Observatory Alliance [30] (IVOA) as an ideal collaborator. The Virtual Observatory (VO) is the vision that astronomical datasets and other resources should work as a seamless whole. Many projects and data centres worldwide are working towards this goal. IVOA is an organisation that debates and agrees the technical standards that are needed to make the VO possible. It also acts as a focus for VO aspirations, a framework for discussing and sharing VO ideas and technology, and a body for promoting and publicising the VO.
Since its formation in 2002, the IVOA has been working on reaching truly world-wide cohesion in debating and agreeing key astronomical standards, on establishing a forum for discussing astronomical data technology in general as well as VO standards in particular, and on achieving rapid agreement on an initial set of basic standards (a table exchange format, a specification for simple catalogue and image query services, the definition of metadata describing resources, a dictionary of standardised column names, and a suite of standards allowing the construction of VO registries). The IVOA is also pursuing further standards, including those needed for virtual storage addressing, single sign-on, semantic reasoning, and grid and web service modularisation. It has a Grid & Web Services Working Group, which has developed an interface to access both the PRACE [31] and EGI [32] infrastructures for scientific computations; both are targeted e-infrastructures in CHAIN-REDS. Within the scope of the CHAIN-REDS regions, it is also worth mentioning the contacts that have been established in principle with the Square Kilometre Array [33] (SKA) in Africa. It should nevertheless be pointed out that SKA is a major consortium formed by organisations from ten countries (Australia, Canada, China, Germany, Italy, New Zealand, South Africa, Sweden, the Netherlands and the United Kingdom) and one Associate Member (India). In this way, it addresses most of the regions of interest to CHAIN-REDS.

The SKA will use hundreds of thousands of radio telescopes, in three unique configurations, which will enable astronomers to monitor the sky in unprecedented detail and to survey the entire sky thousands of times faster than any system currently in existence. The SKA telescopes will be co-located in Africa and in Australia. South Africa's Karoo desert will host the core of the high and mid frequency instruments, with telescopes spread all over the continent, while Australia's Murchison region will cover the low frequency range and host the survey instrument. The huge amount of data that these radio telescopes will collect will have to be properly managed, and CHAIN-REDS aims to provide SKA with its perspective and solutions.

2.9 e-infrastructure Data - iMENTORS

iMENTORS [34] (e-infrastructure monitoring evaluation and tracking support system) is a project co-funded by the European Commission's DG CONNECT under the 7th Framework Programme which aims to build a one-stop-shop data warehouse on all e-infrastructure development projects in Sub-Saharan Africa. By mapping e-infrastructure initiatives, iMENTORS aims to help scientists, universities, research and education networks, policy-makers and international donors gain valuable insights into the gaps and the progress made in the region, and to enhance the coordination of international actors involved in ICT initiatives in this part of the world. iMENTORS is equipped with advanced Geographic Information and Visualisation Systems along with a robust decision-support system that draws public data from many online databases to provide policy support and assist programme planning and implementation.
The ultimate objective of iMENTORS is to form a vibrant online community of practice made up of international actors and practitioners who exchange up-to-date knowledge and information through online social interactions and dedicated spaces for online collaboration, and to encourage the community to adopt and update the platform on its own. Since both projects share the same approach of raising awareness and providing information, and both address a crucial region of the world such as Sub-Saharan Africa, the collaboration between CHAIN-REDS and iMENTORS was deemed very important by both projects and has been framed in a Memorandum of Understanding. iMENTORS will give CHAIN-REDS access to its Data Warehouse, and CHAIN-REDS will give iMENTORS access to its Knowledge Base and its Semantic Search Engine and will collaborate to explore how they can be integrated into the iMENTORS platform.

2.10 WP4 Dissemination actions

During the first twelve months of CHAIN-REDS, several dissemination actions have been carried out. One of them has been the implementation of a WP4 wiki page [35] where information about the use of the Science Gateway, the Parallel Semantic Search Engine and the PID service is displayed. It is also worth mentioning the presentations about the WP4 developments at international conferences and outreach activities (see Table II), which have been specific dissemination actions beyond the ones already reported by the consortium as a whole. To those, other dissemination activities carried out as part of CHAIN-REDS events must be added.

Event | Date | Location | Type of contribution
e-AGE 2012 | Dec 2012 | Dubai (UAE) | Presentation: Data Infrastructures in CHAIN-REDS
SCALAC 2013 | Feb 2013 | Bucaramanga (Colombia) | Presentation: CIEMAT
ISGC 2013 | Mar 2013 | Taipei (Taiwan) | Presentation: Data Infrastructures in CHAIN-REDS
EGI CF 2013 | Apr 2013 | Manchester (UK) | Presentation: Support to Data Infrastructures in CHAIN-REDS
IST-Africa 2013 | May 2013 | Nairobi (Kenya) | Presentation: The Knowledge Base of Open Access Document Repositories (OADRs) and How African Libraries can Contribute to it
EGI TF 2013 | Sep 2013 | Madrid (Spain) | Presentation: Support for VRCs outside of Europe - services by the CHAIN-REDS project
eResearch Africa 2013 | Oct 2013 | Cape Town (South Africa) | Presentation: Data Infrastructures for e-Science (the CHAIN-REDS perspective)
UbuntuNet Connect 2013 | Nov 2013 | Kigali (Rwanda) | Presentation: Virtual Research Communities: Knowledge and Data
RedCLARA Virtual day | Nov 2013 | Latin America | Presentation: Virtual Research Communities: Knowledge and Data

Table II. WP4 dissemination and outreach activities in non-CHAIN-REDS events.

To these presentations, the attendance of the WP4 coordinator at the 2nd EUDAT conference held in Rome (Italy) in Oct 2013 should be added, since new collaboration plans were set up there, as previously mentioned.

In addition, some papers have been accepted for publication during this first year of the project. They are listed in Table III.

Title | Reference
The CHAIN-REDS Semantic Search Engine | Proceedings of the UbuntuNet Connect 2013 Conference, in press.
A CHAIN-REDS Perspective about Data Access and Metadata Management | Proceedings of the e-AGE 2013 Conference, in press.

Table III. Papers accepted for publication through the first year of CHAIN-REDS.

3. Analysis on Data Infrastructures under the CHAIN-REDS perspective

Once the several communities had been approached and the WP4 survey on DRs and OADRs had been completed, an analysis was carried out of the commonalities, differences, requirements and future challenges that these communities have regarding the computing and data infrastructure they use. Such an analysis must of course bear in mind the specificities of the regions of interest to CHAIN-REDS, and its conclusions must be aligned with the European vision on data curation and management.

3.1 Standards

In order to build data trust, it is mandatory to define a set of standards that govern data management and facilitate further use of the data. During the first six months of its lifetime, CHAIN-REDS identified some of them. The project has been working with them, as will be stated in Section 5, and has analysed new standards that could be of interest, i.e., Std5. The current list of standards is the following:

Std1. OAI-PMH [36] for metadata retrieval. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Data providers are repositories that expose structured metadata via OAI-PMH; service providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP.

Std2. Dublin Core [37] as metadata schema. Specifically, the DCMI is an open organisation that has defined a set of vocabulary terms that can be used to describe resources for the purposes of discovery. The terms can be used to describe a full range of web resources, physical resources and objects. The original set of classic metadata terms, known as the Dublin Core Metadata Element Set, comprises 15 entries.

Std3. SPARQL [38] for semantic web search.
Resource Description Framework (RDF) is a standard model for data interchange on the Web. It is a directed, labelled graph data format for representing information in the Web. The SPARQL specification defines the syntax and semantics of the SPARQL query language for RDF. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions. SPARQL also supports extensible value testing and constraining queries by source RDF graph. The results of SPARQL queries can be result sets or RDF graphs.

Std4. XML [39] as potential standard for the interchange of data represented as a set of tables. Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.

Std5. A Persistent Identifier (PID) is defined as a long-lasting reference to a digital object, be it a single file or a set of files. Well-known persistent identifier systems include Archival Resource Keys (ARKs), Digital Object Identifiers (DOIs), Persistent Uniform Resource Locators (PURLs), Uniform Resource Names (URNs) and Extensible Resource Identifiers (XRIs). The use of PIDs is of utmost importance for identifying data on a long-term basis and facilitating their discovery; as a consequence, the European Persistent Identifier Consortium [40] (EPIC) has already been established as an initiative devoted to providing PID services for the European research community, based on the Handle System [41], for the allocation and resolution of persistent identifiers.

 | agINFRA | ENGAGE | EarthServer
Raw data description | Mainly articles and referatories, i.e., they do not hold the content, but aggregate metadata from many sources (more information can be found in agINFRA deliverable D2.3). | Metadata are stored in a PostgreSQL database. The actual raw data linked to the datasets are either direct URLs to external files (on other websites) or local files stored in a cloud storage. | A large part of Earth Observation metadata is stored in XML in practice. Versatile retrieval methods are on the way.
OAI-PMH | Yes | No | Some repositories have OAI-PMH endpoints, and a catalogue of services supporting OAI-PMH queries is being built.
Dublin Core | Yes | Yes | No
SPARQL | Yes | Yes | No
XML | N/A | XML is used for the RDF files (metadata & data) stored in Virtuoso. For the remaining communication formats, JSON (API, visualisations, etc.) is usually employed. |
PID | No | No | No
Intellectual property issues | Majority of open data sets | Majority of open data sets | Yes, some data sets
Are you interested in adopting the missing standards (if any) with the CHAIN-REDS support? | Yes | Yes | Yes

Table IV. Standards adopted by CHAIN-REDS and their status in the collaborative VRCs (Nov 2013).
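Std1 describes OAI-PMH as six verbs invoked within HTTP. As a minimal sketch of what such a request looks like, the following Python snippet builds the GET URL for a harvesting request. The verb names and the `oai_dc` metadata prefix come from the OAI-PMH 2.0 specification; the repository address and the helper name `build_oai_request` are assumptions made only for illustration and do not refer to any CHAIN-REDS service.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_PMH_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_oai_request(base_url, verb, **arguments):
    """Return the HTTP GET URL for an OAI-PMH request (illustrative helper)."""
    if verb not in OAI_PMH_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(arguments)
    return f"{base_url}?{urlencode(query)}"


# Hypothetical repository endpoint, used only for illustration.
url = build_oai_request(
    "https://repo.example.org/oai", "ListRecords", metadataPrefix="oai_dc"
)
print(url)
```

A service provider would issue this request and receive an XML document whose records carry Dublin Core (Std2) metadata, which links the first two standards in the list.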

Bearing them in mind, CHAIN-REDS has started to build a data challenge that will be used as a proof of concept for data trust building, so the first action has been to ask the identified VRCs about their adoption of these standards. A brief survey of the current status has been sent to three of the communities, and the results are summarised in Table IV. From these results, some preliminary conclusions can be drawn:
- Raw data is usually kept in its own format and storage location, so the (big) related initiatives basically collect links to these primary repositories.
- Some of the standards promoted by CHAIN-REDS have largely been adopted. OAI-PMH is not yet adopted in the ENGAGE project, and a closer collaboration has been agreed between the two projects for its implementation. Data retrieval and reuse therefore seem feasible.
- PIDs are not generally used, although their value is acknowledged, and CHAIN-REDS will work towards their promotion in these communities.
- Although this does not apply to the whole set of repositories, the collaborating initiatives work with datasets that are not under intellectual property limitations.

Last, three more points should be raised. The first two of them refer to scheduling issues: the astrophysics (IVOA) and Cultural Heritage (DCH-RP) communities are expected to be added to this round of contacts during 2014, and the CHAIN-REDS support for promoting the proposed standards is expected to be carried out during the scientific data challenge, whose code name has been set to DART (Data Accessibility, Reproducibility and Trustworthiness).

3.2 Transcontinental coverage

CHAIN-REDS supports the interoperation of Grids in Europe and other world regions through its support for Regional Operations Centres in terms of functionality, requirements and structure.
In this regard, WP3 (Interoperation and coordination of e-infrastructures) has described in D3.1, Interoperation model and plan [4], the current operations structure of the European Grid Infrastructure and a model for Grid interoperation between Europe and the other world regions involved in the project. In addition, the interoperation model has been updated, to be implemented through a concrete action plan drafted in cooperation with the representatives of each region. This document aims to repeat neither the interoperation model between Europe and the regions participating in this project nor the blueprint for the implementation of Regional Operation Centres under the two possible scenarios for interoperation with Europe. Nevertheless, some hints about the current status of the different regions are given below in order to better assess on which infrastructures the DART challenge should run:
- The Africa & Arabia Region counts on a ROC and 13 sites (May 2013), although new ones were on the way to being incorporated. They use the middleware and tools promoted by EMI and EGI, although some services are not yet installed.
- The Asia-Pacific Region counts on a ROC and 27 running sites (May 2013), also EMI compliant. The services promoted by EGI have been adopted, and only the signature of a MoU is advisable.
- China counts on two Resource Infrastructure Providers, China ROC and CNGrid. The former counts on 1 Resource Centre and still lacks some EGI services; the latter is a federation of 14 High Performance Computing Resource Centres using the Grid Operating System (GOS) middleware.
- India counts on 8 Resource Centres running a mixture of Globus Toolkit versions, on top of which a set of higher-level middleware services has been developed.
- The Latin American Region counts on 4 sites running operationally within the ROC LA with the EMI middleware. New sites coming from the old IGALC ROC are expected to be incorporated very soon. Management services are implemented and comply with the EGI model.

With regard to clouds, it is clear that they require significant programming and system administration support, and that many gaps and challenges exist in current open-source virtualised cloud software stacks for both production science and education use. In addition, interoperation between commercial and scientific clouds ought to be addressed. Based on the EGI Federated Cloud Task Force [42], which is entering a pre-production phase and is inviting other cloud infrastructure providers to join its federated infrastructure, CHAIN-REDS will start surveying the regions of interest to collect information about clouds for research and education purposes. CHAIN-REDS is also organising a set of round tables to obtain valuable feedback from the different regions; the first of them was held at the CHAIN-REDS workshop co-located with the IEEE e-Science Conference 2013 in Beijing (Oct 2013). Relying on the collected information, suggestions towards cloud interoperability will be proposed.

WP3 and WP4 are working together to design a data trust building challenge that will profit from the different computing facilities distributed worldwide and will provide a legacy for the future use of both data and computational infrastructure. The first action has been the action plan per region described in D3.1; its recommendations are designed to improve the interoperation of the different ROCs around the world with the European strategy, widening the currently available computing power.
Once this is achieved, it will be easier to extend the previously demonstrated interoperability demo (EGI TF 2013) to more sites located in different world regions. The second action has been to accelerate access to the different infrastructures by means of Identity Federations. D5.1 [4] also takes this feature into account and reports on the current status worldwide. One of the potential management solutions identified for Identity Federation issues is the Perun service. The Perun service [43] is a user, resource and service management system developed by the CHAIN-REDS partner CESNET and provided as a service. Since Perun could help research communities to overcome initial barriers when they want to start using federated services or simply want to manage access to shared services, CHAIN-REDS is considering adopting it as a support tool. In addition, CHAIN-REDS is also working on PID services. The project partner GRNET is deeply involved in this development and already provides a PID service [44]. The service uses the Handle System for PID resolution and assignment, and it has been assigned a prefix by the Corporation for National Research Initiatives (CNRI). Furthermore, GRNET provides a REST web interface in front of the Handle service that eases integration with software repositories. As is evident from Table IV, many VRCs have no experience in using PID services as part of their data management practices. CHAIN-REDS is committed to promoting the adoption of PIDs in chosen use cases and in the wider community, and will provide them with the necessary support. Bearing all these aspects in mind, it will be feasible to achieve the final data trust building challenge that is described in the next Section.
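To make the structure of a Handle-based PID concrete, the short Python sketch below splits a handle into its prefix (the part assigned to the naming authority) and its suffix (the local name), and builds a resolvable URL. It assumes resolution through the public Handle proxy at hdl.handle.net; the handle value and the function names are hypothetical, and the snippet does not use or describe the GRNET REST interface.

```python
from urllib.parse import quote

# Public Handle System proxy; assumed here as the resolver.
HANDLE_PROXY = "https://hdl.handle.net/"


def split_handle(handle):
    """Split 'prefix/suffix' at the first slash (illustrative helper)."""
    prefix, separator, suffix = handle.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return prefix, suffix


def handle_to_url(handle):
    """Return the proxy URL that resolves the given handle."""
    prefix, suffix = split_handle(handle)
    # Percent-encode characters that are not safe in a URL path.
    return f"{HANDLE_PROXY}{quote(prefix)}/{quote(suffix)}"


# Hypothetical handle, used only for illustration.
print(handle_to_url("20.500.12345/abc-001"))
```

Because the identifier is independent of where the object is stored, the data provider can move the file and update only the handle record, while citations of the PID keep working.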

4. The CHAIN-REDS tools

This Section provides a brief description of the CHAIN-REDS tools that are available on its website and are of interest to Data Infrastructures. They are the backbone that will be used by the future data challenge called DART, which aims to provide a proof of principle for data trust building.

4.1 The CHAIN-REDS Knowledge Base

The CHAIN-REDS Knowledge Base (KB) is one of the largest existing e-infrastructure-related digital information systems. It currently contains information, gathered both from dedicated surveys and from other web and documentary sources, for well over half of the countries in the world. Information is presented to visitors through geographic maps and tables. The country view is shown in Fig. 2. Users can choose a continent on the map and, for each country where a marker is displayed, obtain information about the Regional Research & Education Network(s) and the Grid Regional Operation Centre(s) the country belongs to, as well as the National Research & Education Network, the National Grid Initiative, the Certification Authority and the Identity Federation available in the country, down to the Grid site(s) running in the country and the scientific application(s) developed by researchers of the country and running on those sites. Besides e-infrastructure sites, services and applications, the CHAIN-REDS KB publishes information about Open Access Document Repositories and Data Repositories, as described in Section 2. A deeper explanation of the KB characteristics can also be found in D4.1.

Figure 2. A snapshot of the CHAIN-REDS Knowledge Base - country view

Although it is quite useful to have a central access point to thousands of repositories and millions of documents and datasets, with both geographic and tabular information, the OADR and DR part of the CHAIN-REDS KB is only a demonstrator with limited impact on scientists' day-to-day work. In order to find a document or a dataset, users must know beforehand what they are looking for, and there is no way to correlate documents and data, which would actually be one of the most important facilitators. In order to overcome these limitations and turn the KB into a powerful research tool, the CHAIN-REDS consortium has decided to semantically enrich the OADRs and DRs gathered in the KB and to build a search engine on the related linked data. The CHAIN-REDS Semantic Search Engine is the result of this effort, led by INFN.

4.2 The CHAIN-REDS Semantic Search Engine

The multi-layered architecture of the search engine is sketched in Figure 3, where the official and de facto Semantic Web standards and technologies adopted are indicated by small logos. Starting from the bottom of Figure 3, the first two components of the service are described below.

Figure 3. Architecture of the Semantic Search Engine

The metadata harvester is a process able to run on both Grid and Cloud infrastructures, which consists of the following steps:
- Get the address of each repository publishing a standard OAI-PMH endpoint;
- Retrieve, using the OAI-PMH repository address, the related Dublin Core encoded metadata in XML format;
- Get the records from the XML files and, using the Apache Jena API, transform the metadata into RDF format;
- Save the RDF files into a Virtuoso triple store according to an OWL-compliant ontology built using Protégé.
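The middle steps of the harvester (extracting Dublin Core records from an OAI-PMH response and turning them into RDF statements) can be sketched as follows. This is only an illustration under stated assumptions: it uses the Python standard library instead of the Apache Jena API mentioned above, the sample response and record identifier are invented, and the plain `dc:` predicates stand in for the project's own ontology, which is not reproduced here. The namespace URIs are those published by OAI and DCMI.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
DC = NS["dc"]

# Invented ListRecords response with a single Dublin Core record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample open access document</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def records_to_triples(xml_text):
    """Return (subject, predicate, literal) tuples for every dc:* element."""
    root = ET.fromstring(xml_text)
    triples = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        dc_root = record.find("oai:metadata/oai_dc:dc", NS)
        if identifier is None or dc_root is None:
            continue  # deleted records carry no metadata
        for element in dc_root:
            if element.tag.startswith("{" + DC + "}") and element.text:
                predicate = DC + element.tag.split("}", 1)[1]
                triples.append((identifier, predicate, element.text.strip()))
    return triples


def to_ntriples(triples):
    """Serialise the tuples as N-Triples lines with literal objects."""
    lines = []
    for subject, predicate, value in triples:
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{predicate}> "{literal}" .')
    return "\n".join(lines)


print(to_ntriples(records_to_triples(SAMPLE_RESPONSE)))
```

The resulting N-Triples text is one of the RDF serialisations that triple stores can generally load, which is why it is used here as the hand-over format for the last step.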
Each Resource Description Framework (RDF) file retrieved and saved in a Virtuoso [45]-enabled triple store is mapped onto a Virtuoso Graph that contains the ontology expressly developed for the search engine, which is shown in Figure 4 for the sake of completeness. The ontology, built using the Dublin Core and FOAF standards, consists of:
- Classes that describe the general concepts of the domain: Resource, Author, Organisation, Repository and Dataset (where a Resource is a given open access document);
- Object properties that describe the relationships among the ontology classes; the ontology developed for the service has several specific properties such as hasAuthor (i.e., the relation between Resources and Authors) and hasDataset (i.e., the relation between Resources and Datasets);
- Data properties (or attributes) that contain the characteristics or parameters of the classes.

Figure 4. Schema of the ontology used for the Semantic Search Engine.

The third, and highest-level, component is the Search Engine itself. Using it, visitors can either enter a keyword and submit a SPARQL query to the Virtuoso triple store, or select a language and get, on the left side of the page, the list of subjects available in that language, with the number of records available for each subject indicated in parentheses (see Figure 5).
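The keyword search described above amounts to sending a SPARQL query to the triple store. The sketch below builds one plausible form of such a query in Python. The query actually issued by the Semantic Search Engine is not given in this document, so both the use of the Dublin Core `dc:title` predicate and the helper names are assumptions; `CONTAINS`, `LCASE` and `STR` are standard SPARQL 1.1 functions.

```python
DC_NS = "http://purl.org/dc/elements/1.1/"


def escape_literal(text):
    """Escape characters that would break a double-quoted SPARQL string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_title_query(keyword, limit=10):
    """Return a SELECT query for resources whose title contains the keyword."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    literal = escape_literal(keyword)
    return (
        f"PREFIX dc: <{DC_NS}>\n"
        "SELECT ?resource ?title\n"
        "WHERE {\n"
        "  ?resource dc:title ?title .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?title)), LCASE("{literal}")))\n'
        "}\n"
        f"LIMIT {limit}"
    )


# Example values chosen for illustration only.
print(build_title_query("eye", limit=10))
```

Escaping the keyword before it is placed inside the query keeps a stray quotation mark in the user input from changing the structure of the query.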

Figure 5. Schema of the ontology used for the Semantic Search Engine.

The results of a given query are listed in a summary view displayed directly on the webpage. For each record found, the title, the author(s) and a short description of the corresponding resource are provided. By clicking on the More Info link, visitors can access the detailed view of the resource. In the Dataset information panel, users get the link to the open access document and, if it exists, to the corresponding dataset. By clicking on the Graphs tab, which appears at the top of the summary view, users can select one or more of the resources found and get a graphic view of the semantic connections among Authors, Subjects and Publishers, as shown in Figure 6. In this way, if new links appear that connect different resources (as shown in the lower left corner of the figure), users can infer new relations among resources and thus discover new knowledge. It is worth mentioning that this part of the Semantic Search Engine is at a prototype stage and is subject to changes and improvements in the coming months.

Figure 6. Graphic connections among records found by the Semantic Search Engine.

In addition to these capabilities, a very important one has recently been implemented: the ability to perform either single [46] or parallel [22] semantic searches. By moving the mouse over the "Semantic Search" link of the CHAIN-REDS webpage, any user can see a sub-menu with two items:
- Single: the usual semantic search service described above; and,
- Parallel: the new parallel semantic search service, which allows users to search in parallel (i.e., at the same time) across the millions of resources contained in the CHAIN-REDS Knowledge Base and in the ENGAGE Platform.

Parallel semantic search engines have also been made available in the Science Gateways of some collaborating projects, thus enhancing and extending the solutions proposed by CHAIN-REDS. This parallel semantic search can be found at:
- agINFRA [18], where the user can search in parallel across the millions of resources contained in the CHAIN-REDS Knowledge Base and in the OpenAgris repository; and,
- DCH-RP [47], where the user can search in parallel across the tens of millions of resources contained in the CHAIN-REDS Knowledge Base and in the Europeana [48], Cultura Italia [49] and Isidore [50] repositories.

A snapshot of these parallel semantic search webpages is shown in Figure 7, which clearly displays, for the user's information, the repositories that are included in the parallel semantic search.
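The idea of searching several repositories at the same time can be illustrated with a small Python sketch that submits the same keyword to several backends concurrently and collects the answers per backend. The two backend functions are stand-ins that return canned results; how the CHAIN-REDS service actually dispatches its queries is not described here, so the whole structure is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def search_knowledge_base(keyword):
    """Stand-in for a query against one repository (hypothetical)."""
    return [f"KB record about {keyword}"]


def search_partner_platform(keyword):
    """Stand-in for a query against a second repository (hypothetical)."""
    return [f"partner record about {keyword}"]


def parallel_search(keyword, backends):
    """Run every backend on the same keyword concurrently.

    `backends` maps a display name to a callable taking the keyword.
    Returns a dict mapping each name to that backend's result list.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        futures = {
            name: pool.submit(search, keyword)
            for name, search in backends.items()
        }
        return {name: future.result() for name, future in futures.items()}


results = parallel_search(
    "eye",
    {
        "Knowledge Base": search_knowledge_base,
        "Partner platform": search_partner_platform,
    },
)
for name, records in results.items():
    print(name, records)
```

Keeping the results grouped by backend mirrors the way the webpages tell the user which repositories took part in the search.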

Figure 7. The parallel semantic search webpages of agINFRA (top) and DCH-RP (bottom).

Programmatic use of the CHAIN-REDS Semantic Search Engine is also possible thanks to the very recent development of a purpose-built RESTful API. It is now possible to retrieve and re-use the many millions of open access resources contained in the CHAIN-REDS Knowledge Base and stored in a Virtuoso RDF-compliant database by calling the Semantic Search Engine from an ordinary website or even a mobile application. An example of the command to type and of how the information is displayed is shown in Figure 8.

Figure 8. How the information is displayed when using the RESTful API of the CHAIN-REDS Semantic Search Engine. For legibility, the request shown in the web browser searches for 10 resources that contain the keyword "eye" in the title; the request is submitted by pasting its text into the web browser address bar.
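A request like the one in Figure 8 (keyword "eye", 10 resources) could be assembled in Python roughly as follows. This is a sketch under stated assumptions: the endpoint `BASE_URL` and the parameter names `keyword` and `limit` are placeholders, not the real address or parameters of the CHAIN-REDS API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint used only for illustration.
BASE_URL = "https://example.org/semantic-search/api/resources"


def build_search_url(keyword, limit=10, base_url=BASE_URL):
    """Build a GET request URL for a keyword search with a result limit."""
    # The parameter names "keyword" and "limit" are illustrative assumptions.
    query = urlencode({"keyword": keyword, "limit": limit})
    return f"{base_url}?{query}"


url = build_search_url("eye", limit=10)
```

The resulting string can be pasted into a browser address bar or passed to any HTTP client; `urlencode` takes care of escaping spaces and special characters in the keyword.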


More information

Systems Engineering Tools Integration and Interoperability using OSLC in the SPRINT project

Systems Engineering Tools Integration and Interoperability using OSLC in the SPRINT project Systems Engineering Tools Integration and Interoperability using OSLC in the SPRINT project Andreas Keis, Parham Vasaiely (EADS Innovation Works, Newport) Uri Shani (IBM Israel Science and Technology Ltd.,

More information

Digital preservation a European perspective

Digital preservation a European perspective Digital preservation a European perspective Pat Manson Head of Unit European Commission DG Information Society and Media Cultural Heritage and Technology Enhanced Learning Outline The digital preservation

More information

A PLATFORM FOR SHARING DATA FROM FIELD OPERATIONAL TESTS

A PLATFORM FOR SHARING DATA FROM FIELD OPERATIONAL TESTS A PLATFORM FOR SHARING DATA FROM FIELD OPERATIONAL TESTS Yvonne Barnard ERTICO ITS Europe Avenue Louise 326 B-1050 Brussels, Belgium y.barnard@mail.ertico.com Sami Koskinen VTT Technical Research Centre

More information

13 th EC GI & GIS Workshop WIN: A new OGC compliant SOA. for risk management. GMV, 2007 Property of GMV All rights reserved

13 th EC GI & GIS Workshop WIN: A new OGC compliant SOA. for risk management. GMV, 2007 Property of GMV All rights reserved 13 th EC GI & GIS Workshop WIN: A new OGC compliant SOA for risk management GMV, 2007 Property of GMV All rights reserved Content 1. Introduction 2. Objectives 3. Architecture and Model 4. Technical aspects

More information

Open Access and Open Research Data in Horizon 2020

Open Access and Open Research Data in Horizon 2020 Open Access and Open Research Data in Horizon 2020 Celina Ramjoué Head of Sector Open Access to Scientific Publications and Data Digital Science Unit CONNECT.C3 22 November 2013 Train the Trainer for H2020

More information

HL7 AROUND THE WORLD

HL7 AROUND THE WORLD HL7 International HL7 AROUND THE WORLD Updated by the HL7 International Mentoring Committee, September 2014 Original version by Klaus Veil (2009) / Edited by Diego Kaminker IMC HL7 Around the World 1 What

More information

General concepts: DDI

General concepts: DDI General concepts: DDI Irena Vipavc Brvar, ADP SEEDS Kick-off meeting, Lausanne, 4. - 6. May 2015 How to describe our survey What we learned so far: If we want to use data at some point in the future,

More information