Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

WP11 Data Storage and Analysis
Task 11.1 Coordination
Deliverable 11.3 Selected Standards and Architecture for Data Storage and Access Services supporting Euro-BioImaging
Task leaders: UNIVDUN
Additional Task Contributors:
October

1. Report Summary

WP11 Objectives

To define a roadmap towards the construction of a European Biomedical Imaging Data Storage and Analysis Infrastructure. The key objectives of this infrastructure will be:

- to support efficient and standardized storage for and access to curated biomedical image data.
- to support open-source software for biomedical image analysis through coordination of community efforts, provision of an actively maintained repository of state-of-the-art validated algorithms for quantitative image analysis and thorough training.
- to interface with high performance computing facilities for high-throughput and/or computation-intensive image analysis.
- to provide seamless collaboration and access to other relevant computing and data resources in ESFRI and in European and national infrastructures.

Digital imaging is now routinely used across both the life and biomedical sciences and has become an essential tool for all aspects of research, training and clinical practice. As image data volumes and complexity grow, it becomes increasingly difficult to access, view, share and analyse datasets using standard desktop-based solutions. In addition, the need for interdisciplinary collaboration, where datasets are shared with consortia of scientists with expertise in experimental biology, data analysis, image processing, modelling, physiology and/or medicine, has grown as well. In this new age, processing, analysis and sharing of image data based on conventional desktop-based solutions is simply no longer possible. Multi-dimensional images from unique clinical cohorts must be securely shared among defined collaborations to enable full analysis and query, while ensuring that identifiable data are never made publicly available. In biological imaging, technologies such as multi-dimensional fluorescence or high-content screening are becoming standard approaches to reveal fundamental biological mechanisms that explain human physiology and disease.
Datasets produced by these technologies are routinely 10s to 100s of GB, and in some cases many TB. Examples are shown in Table 1, which lists reported dataset sizes from Euro-BioImaging WP6 and WP7 PCS Users. To deliver the potential of these data, and the ambition and potential of European science, technology that enables access to image data regardless of where it is produced has become a critical scientific need, and one which the Euro-BioImaging infrastructure must address.

Table 1. Data volumes recorded by Euro-BioImaging Users during the WP6 and WP7 Proof of Concept Studies. Data recorded using different technologies are shown.

User ID | Data Size (GB) | Technology
3 | 1 | STED-microscopy
22 | 1 | High-throughput microscopy (Assay development)
2 | 2 | STED-microscopy
4 | 2 | STED-microscopy, CLSM
9 | 5 | FRAP, FRET, PLA and IF
1 | 8 | AFM, STED-microscopy, CLSM
5 | 16 | STED-microscopy, STED-deconvolution
6 | 16 | STED-microscopy
 |  | Laser Nanosurgery on confocal microscope
 |  | Functional Imaging of Living Cells, FRAP + FRET
 |  | High Content Screening and Confocal Microscopy
 |  | Laser Nanosurgery on confocal microscope
 |  | Spectral imaging
 |  | Laser scanning and spinning disc microscopy
 |  | LSM, Microinjection, Electron microscopy
 |  | OMX 3DSIM
 |  | CLEM and electron tomography
 |  | CLSM, 2P LSCM, FRAP, photoactivation
 |  | High-throughput microscopy
 |  | High-throughput microscopy
 |  | High-throughput microscopy
 |  | High-throughput microscopy

In this deliverable we define the requirements for Euro-BioImaging image database systems, and in particular consider the importance of standardised software interfaces to access and process data in these systems. We propose the creation of a central Euro-BioImaging Image Data Repository (EB-IDR) for scientific image data, to be used as the community resource for access, mining, standards, benchmarking and publication of image data.

2. Image databases and repositories

2.1 Data repositories at Euro-BioImaging Nodes

All data recorded at Euro-BioImaging Nodes will be initially stored at the Node. Depending on the type of data collected and its future use, the responsibility for storing and providing access to the data may or may not shift to the User; the initial Euro-BioImaging data policy is that data belongs to the User. For example, with current (late 2013) network and portable storage capacities, datasets up to 100 GB can be transferred using various on-line file transfer protocols, and datasets between 100 GB and 1 TB can be transferred by portable media. Datasets larger than a few TB, however, are not currently portable either on-line or physically, and thus in most cases will have to remain at the Node until analysis is completed and the User transfers them to a local archive at their home institution, or decides to submit them to the EB-IDR. Besides the sheer volume of the data, the expertise, tools, and/or resources needed to fully process and analyse the data may not be easily transferred from Node to User. In each of these cases, image databases that store data and enable sharing, processing, access and, where necessary, data transfer will be a critical part of the operation plan for each Node.

OMERO, CellBase, Bisque and OSIRIX (described in detail in D11.2) are examples of well-developed open source image database solutions that Euro-BioImaging Nodes can choose to use for these functions. They are all under continuing development, but should currently provide the capabilities required by those Euro-BioImaging Nodes that collect large datasets. Euro-BioImaging Nodes and Users will have tools at their disposal to access and process the data they generate. Given the range of image data included in Euro-BioImaging, it is unlikely that Euro-BioImaging can specify a single standardised solution for managing access to data at all Nodes. However, given the advanced state of the applications described in D11.2, Euro-BioImaging must ensure that Nodes have sufficient capabilities to deploy and, if at all possible, contribute to the development of these applications.
By ensuring the deployment and use of such applications, Euro-BioImaging Nodes can help drive their development, to the benefit of the whole Euro-BioImaging community. An example of this type of usage is shown in Figure 1. Datasets recorded as part of the Euro-BioImaging WP6 Proof-of-Concept Study using the University of Dundee 3DSIM OMX microscope have been loaded into an OMERO server. Images are accessible remotely to the PCS Users using OMERO clients (see ). Data is uploaded by the facility manager into accounts owned by each User.
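The late-2013 transfer thresholds given in Section 2.1 can be sketched as a small decision helper. This is a minimal illustration, not part of the deliverable: the names `TransferRoute` and `suggest_transfer_route` are hypothetical, decimal units (1 TB = 1000 GB) are assumed, and because the text leaves the range between 1 TB and "a few TB" unspecified, the sketch assumes anything above 1 TB remains at the Node.

```python
from enum import Enum


class TransferRoute(Enum):
    """Hypothetical labels for the three cases described in Section 2.1."""
    ONLINE = "on-line file transfer"
    PORTABLE_MEDIA = "portable media"
    REMAIN_AT_NODE = "remain at Node (or submit to EB-IDR)"


# Thresholds quoted in the text for late 2013, in GB (decimal units assumed).
ONLINE_LIMIT_GB = 100
PORTABLE_LIMIT_GB = 1000  # 1 TB


def suggest_transfer_route(size_gb: float) -> TransferRoute:
    """Return the transfer route suggested by the Section 2.1 thresholds.

    Assumption: datasets above 1 TB are treated as staying at the Node,
    although the text only states this for datasets "larger than a few TB".
    """
    if size_gb < 0:
        raise ValueError("dataset size must be non-negative")
    if size_gb <= ONLINE_LIMIT_GB:
        return TransferRoute.ONLINE
    if size_gb <= PORTABLE_LIMIT_GB:
        return TransferRoute.PORTABLE_MEDIA
    return TransferRoute.REMAIN_AT_NODE


if __name__ == "__main__":
    # Example sizes in GB; 16 GB is the largest size preserved in Table 1.
    for size in (16, 500, 5000):
        print(f"{size} GB -> {suggest_transfer_route(size).value}")
```

The thresholds are kept as named constants because the text ties them explicitly to the network and storage capacities of late 2013, so they would need revisiting over time.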

Figure 1. Data collected as part of the WP6 PCS at the University of Dundee loaded into an OMERO system, accessible by PCS Users who collected the data.

2.2 Public Image Data Repositories

The Euro-BioImaging community will have several specific needs for remote on-line access to image data. For this purpose, a Euro-BioImaging Image Data Repository (EB-IDR) will be required to deliver several functions:

Benchmark Datasets. Alongside the Euro-BioImaging web portal that describes resources for image processing (see D11.6), test and/or benchmark datasets will be required to validate analysis tools and compare their performance on different data types. These data may comprise datasets acquired at Euro-BioImaging Nodes or others submitted by the community. Regardless, these datasets, which will amount to several TB, must be housed in an on-line repository, annotated, searchable, licensed, and available for community use, and flanked by cloud compute services that can host user algorithms for the benchmarking activity. Once deployed, these datasets may be accessed by simple download or, for large multi-TB datasets, linked to fixed or cloud-based compute resources for processing (see D11.2).

Datasets for Collaborative Research. Most data collected at Euro-BioImaging Nodes will be related to specific experiments, and therefore should be stored temporarily at the Euro-BioImaging Node for quality control and analysis and then archived locally via appropriate mechanisms at the user's home institution in line with funders' requirements. Some portion of these data may be published alongside the paper reporting the results, but in general, these data will remain with the Euro-BioImaging User. However, a smaller proportion of datasets are critical community resources. They may be the foundation for national or trans-national research collaborations or reference data linked to other resources (e.g., biological sequence, expression or other on-line databases). These datasets, which we call "reference images", consist of the images themselves plus experimental and/or analytic metadata. The definition of these datasets is somewhat fluid, but some examples are:

1. Cellular Phenotypes
2. Cellular structure data (derived from super-resolution light microscopy and 3D electron microscopy)
3. Cellular Atlases
4. Model Organism Atlases
5. Human Atlases

Given the scientific value, technical sophistication or sheer expense associated with creating reference image data, public access to these data must be enabled. Similar approaches and collaborative datasets exist in the area of pathology image data acquired from biobank patient samples or mouse clinic image data of animal disease models. Euro-BioImaging is collaborating with the ESFRI infrastructures Infrafrontier and BBMRI, which operate in these domains, to ensure interoperability and compatible standards for these datasets in the framework of the BioMedBridges biomedical sciences research infrastructure cluster project.

2.3 Scale of the Euro-BioImaging Data Resource (EB-IDR)

When proposing a resource like the EB-IDR, careful planning of scale and scope is critical. Fortunately, Euro-BioImaging's Community Survey, the very successful PCS programme, and the results from the first Open Call for Expression of Interest (EoI) for Euro-BioImaging Nodes together provide substantial real-world experience that gives a sound basis for constructing the future image data infrastructure.
These data allow an estimate of the number of Euro-BioImaging Nodes, the number of Users and the size of the datasets they collect. The results of these projections are shown in Table 2. Depending on the assumptions made, the projected values vary by only about 2x. We also estimate that 15% - 35% of all data generated by "normal" Euro-BioImaging Nodes and by "big" Euro-BioImaging Nodes that offer flagship technologies with already established data standards, e.g. in CLEM and high throughput microscopy, will be deposited in the EB-IDR, a fraction likely to increase as data and annotation standards spread from initial reference data sets to more image-based science domains. Furthermore, it is expected that data produced at the Nodes would be temporarily mounted to allow remote user access and analysis (ca. 50% of reference dataset storage capacity). The key conclusion is that the sizes of datasets are certainly large, and likely to grow rapidly in the future. This puts image data at the same scale as biomolecular data, i.e. genome sequences, and therefore similar in scale to other data infrastructures and repositories (e.g., ELIXIR and EBI).

Table 2. Scaled Data Storage Model for Euro-BioImaging. Estimated projections for size of EB-IDR, in TB stored per year. Number of Nodes, Users, and Dataset sizes based on response from PCS providers and Euro-BioImaging Node EoI applications.

 | Years 1-2 (15 Nodes) | Years 3-4 (25 Nodes) | Years 5-6 (40 Nodes)
Users/Yr | 525* | 875* | 1500*
Users/Node/Yr |  |  |
Nodes Data Size Split (Fraction of "Normal") |  |  |
No. "Normal" Nodes (Dataset = 1 TB/user, scale over time) |  |  |
No. "Big" Nodes (Dataset = 10 TB/user, scale over time) |  |  |
Total Data "Normal" Node/Yr ,5 (TB) |  |  |
Total Data "Big" Node/Yr (TB) |  |  |
Total Data all Users/all Nodes/Yr (TB) |  |  |
Steady state cloud provision at EB-IDR (ca. 50% of reference dataset storage capacity)*** |  |  |
Storage capacity for reference dataset, published data, data for benchmarking etc. at EB-IDR** | 221 | ( ) = 1030 | ( ) = 3445

* Assumes 1.3 Users/two weeks, excluding holiday, downtime, etc.
** Assumes ca. 15%/25%/35% of total data recorded at Nodes is published/committed to IDR.
*** The steady-state cloud provision will need to be reviewed in the future, and the actual implementation will depend on emerging technologies and capabilities.

4. Conclusion

The main requirement of the Euro-BioImaging Data Infrastructure will be the provision of image databases that store data and enable sharing, processing, access and, where necessary, transfer of data.
It is expected that Euro-BioImaging Nodes will generate datasets ranging in size from TB on average. Delivering this data to Users will therefore require more sophisticated methods than USB sticks or writing DVDs. Several applications are now available that enable enterprise-scale data management and remote access, and Euro-BioImaging Nodes will need to invest in the hardware infrastructure and expert staff required to deliver these capabilities to their Users. A Euro-BioImaging Image Data Repository can hold and serve data linked to publications as well as reference datasets, to promote large-scale collaborations, benchmarking and data mining. The Euro-BioImaging Survey and PCS provide solid evidence that defines the scale and capabilities of such a resource.
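The scaling arithmetic behind Table 2 can be sketched roughly as follows. This is an illustration under stated assumptions and does not attempt to reproduce the table's figures, most of which are not preserved here. Taken from the text are the per-user dataset sizes (1 TB for "normal" Nodes, 10 TB for "big" Nodes), the 15%/25%/35% deposit fractions and the ca. 50% cloud provision. The value of 35 Users per Node per year is inferred from 525 Users across 15 Nodes, the `normal_fraction` split is purely hypothetical, and growth of dataset size over time is ignored. The names `Period` and `project` are illustrative only.

```python
from dataclasses import dataclass

# Per-user dataset sizes stated in Table 2 (growth over time is ignored here).
NORMAL_TB_PER_USER = 1.0
BIG_TB_PER_USER = 10.0
# "ca. 50% of reference dataset storage capacity" for temporary cloud provision.
CLOUD_FRACTION = 0.5


@dataclass(frozen=True)
class Period:
    label: str
    nodes: int
    users_per_node_per_year: float  # inferred value, see lead-in
    normal_fraction: float          # hypothetical share of Users at "normal" Nodes
    deposit_fraction: float         # share of recorded data committed to the EB-IDR


def project(period: Period) -> dict:
    """Estimate yearly data volumes (TB) for one period of the model."""
    users = period.nodes * period.users_per_node_per_year
    normal_tb = users * period.normal_fraction * NORMAL_TB_PER_USER
    big_tb = users * (1.0 - period.normal_fraction) * BIG_TB_PER_USER
    total_tb = normal_tb + big_tb
    deposited_tb = total_tb * period.deposit_fraction
    return {
        "users": users,
        "normal_tb": normal_tb,
        "big_tb": big_tb,
        "total_tb": total_tb,
        "deposited_tb": deposited_tb,
        "cloud_tb": deposited_tb * CLOUD_FRACTION,
    }


if __name__ == "__main__":
    # normal_fraction=0.5 is a placeholder, not a value from the deliverable.
    periods = (
        Period("Years 1-2", nodes=15, users_per_node_per_year=35,
               normal_fraction=0.5, deposit_fraction=0.15),
        Period("Years 3-4", nodes=25, users_per_node_per_year=35,
               normal_fraction=0.5, deposit_fraction=0.25),
    )
    for p in periods:
        print(p.label, project(p))
```

Keeping the split between "normal" and "big" Nodes as an explicit parameter makes it easy to see how sensitive the total is to that assumption, since "big" Node Users contribute ten times more data each.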

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.1 Coordination Deliverable 11.2 Community Needs of

More information

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.2 Coordination Deliverable 11.6 Standards and Architecture

More information

A pretty picture, or a measurement? Retinal Imaging

A pretty picture, or a measurement? Retinal Imaging Big Data Challenges A pretty picture, or a measurement? Organelles Dynamics Cells Retinal Imaging Physiology Pathology Fundus Camera Optical coherence tomography Fluorescence Histology High Content Screening

More information

DMBI: Data Management for Bio-Imaging.

DMBI: Data Management for Bio-Imaging. DMBI: Data Management for Bio-Imaging. Questionnaire Data Report 1.0 Overview As part of the DMBI project an international meeting was organized around the project to bring together the bio-imaging community

More information

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences. WP11 Data Storage and Analysis

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences. WP11 Data Storage and Analysis Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.1 The data storage, curation and access challenge:

More information

EMBL Identity & Access Management

EMBL Identity & Access Management EMBL Identity & Access Management Rupert Lück EMBL Heidelberg e IRG Workshop Zürich Apr 24th 2008 Outline EMBL Overview Identity & Access Management for EMBL IT Requirements & Strategy Project Goal and

More information

Case Study Life Sciences Data

Case Study Life Sciences Data Case Study Life Sciences Data Centre for Integrative Systems Biology and Bioinformatics www.imperial.ac.uk/bioinfsupport Sarah Butcher s.butcher@imperial.ac.uk www.imperial.ac.uk/bioinfsupport Bio-data

More information

BREAK-OUT SESSION II : TRAINING WP13. Stakeholders Meeting Heidelberg, 31 st Jan 2012

BREAK-OUT SESSION II : TRAINING WP13. Stakeholders Meeting Heidelberg, 31 st Jan 2012 BREAK-OUT SESSION II : TRAINING WP13 Chairs: Pavel Hozak & Alejandro Frangi Stakeholders Meeting Heidelberg, 31 st Jan 2012 Welcome to WP13 Training discussion! Meeting Agenda 09.00 09.10 Word of welcome

More information

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel

More information

Thermo Scientific ArrayScan XTI High Content Analysis Reader. revolutionizing cell biology with the power of high content

Thermo Scientific ArrayScan XTI High Content Analysis Reader. revolutionizing cell biology with the power of high content Thermo Scientific ArrayScan XTI High Content Analysis Reader revolutionizing cell biology with the power of high content learn more about your cells using high content technology Thermo Scientific High

More information

BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS

BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS 1. The Technology Strategy sets out six areas where technological developments are required to push the frontiers of knowledge

More information

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data in BioMedical Sciences Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data for BioMedical Sciences EMBL-EBI: What we do and why? Challenges & Opportunities Infrastructure Requirements

More information

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI

Big Data in BioMedical Sciences. Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data in BioMedical Sciences Steven Newhouse, Head of Technical Services, EMBL-EBI Big Data for BioMedical Sciences EMBL-EBI: What we do and why? Challenges & Opportunities Infrastructure Requirements

More information

IO Informatics The Sentient Suite

IO Informatics The Sentient Suite IO Informatics The Sentient Suite Our software, The Sentient Suite, allows a user to assemble, view, analyze and search very disparate information in a common environment. The disparate data can be numeric

More information

Workprogramme 2014-15

Workprogramme 2014-15 Workprogramme 2014-15 e-infrastructures DCH-RP final conference 22 September 2014 Wim Jansen einfrastructure DG CONNECT European Commission DEVELOPMENT AND DEPLOYMENT OF E-INFRASTRUCTURES AND SERVICES

More information

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP1 Project Management Task 1.3 Delivery of milestones; reporting duties Deliverable 1.1

More information

Keystones for supporting collaborative research using multiple data sets in the medical and bio-sciences

Keystones for supporting collaborative research using multiple data sets in the medical and bio-sciences Keystones for supporting collaborative research using multiple data sets in the medical and bio-sciences David Fergusson Head of Scientific Computing The Francis Crick Institute The Francis Crick Institute

More information

Horizon 2020. Research e-infrastructures Excellence in Science Work Programme 2016-17. Wim Jansen. DG CONNECT European Commission

Horizon 2020. Research e-infrastructures Excellence in Science Work Programme 2016-17. Wim Jansen. DG CONNECT European Commission Horizon 2020 Research e-infrastructures Excellence in Science Work Programme 2016-17 Wim Jansen DG CONNECT European Commission 1 Before we start The material here presented has been compiled with great

More information

Solution for private cloud computing

Solution for private cloud computing The CC1 system Solution for private cloud computing 1 Outline What is CC1? Features Technical details Use cases By scientist By HEP experiment System requirements and installation How to get it? 2 What

More information

IT of SPIM Data Storage and Compression. EMBO Course - August 27th! Jeff Oegema, Peter Steinbach, Oscar Gonzalez

IT of SPIM Data Storage and Compression. EMBO Course - August 27th! Jeff Oegema, Peter Steinbach, Oscar Gonzalez IT of SPIM Data Storage and Compression EMBO Course - August 27th Jeff Oegema, Peter Steinbach, Oscar Gonzalez 1 Talk Outline Introduction and the IT Team SPIM Data Flow Capture, Compression, and the Data

More information

Checklist for a Data Management Plan draft

Checklist for a Data Management Plan draft Checklist for a Data Management Plan draft The Consortium Partners involved in data creation and analysis are kindly asked to fill out the form in order to provide information for each datasets that will

More information

Introduction to Research Data Management

Introduction to Research Data Management Introduction to Research Data Management Marta Teperek, Veronica Phillips 30/10/2015 University of Cambridge TODAY: Mixture of activities and talking Introduction 1. Backup and exchange strategies 2. How

More information

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP9 Access to Innovative Technologies Medical Imaging Task 9.2: Organization of a European

More information

ZEISS Microscopy Course Catalog

ZEISS Microscopy Course Catalog ZEISS Microscopy Course Catalog ZEISS Training and Education Expand Your Possibilities Practical microscopy training has a long tradition at ZEISS. The first courses were held in Jena as early as 1907,

More information

Task Scheduling in Hadoop

Task Scheduling in Hadoop Task Scheduling in Hadoop Sagar Mamdapure Munira Ginwala Neha Papat SAE,Kondhwa SAE,Kondhwa SAE,Kondhwa Abstract Hadoop is widely used for storing large datasets and processing them efficiently under distributed

More information

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

A Novel Cloud Based Elastic Framework for Big Data Preprocessing School of Systems Engineering A Novel Cloud Based Elastic Framework for Big Data Preprocessing Omer Dawelbeit and Rachel McCrindle October 21, 2014 University of Reading 2008 www.reading.ac.uk Overview

More information

NetApp Big Content Solutions: Agile Infrastructure for Big Data

NetApp Big Content Solutions: Agile Infrastructure for Big Data White Paper NetApp Big Content Solutions: Agile Infrastructure for Big Data Ingo Fuchs, NetApp April 2012 WP-7161 Executive Summary Enterprises are entering a new era of scale, in which the amount of data

More information

Web Application Hosting Cloud Architecture

Web Application Hosting Cloud Architecture Web Application Hosting Cloud Architecture Executive Overview This paper describes vendor neutral best practices for hosting web applications using cloud computing. The architectural elements described

More information

6 ELIXIR Domain Specific Services

6 ELIXIR Domain Specific Services 6 ELIXIR Domain Specific Services Work stream leads: Alfonso Valencia (ES), Inge Jonassen (NO), Jose Leal (PT) Work stream members: Nils-Peder Willassen (NO), Finn Drablos (NO), Mark Viant (UK), Ferran

More information

#jenkinsconf. Jenkins as a Scientific Data and Image Processing Platform. Jenkins User Conference Boston #jenkinsconf

#jenkinsconf. Jenkins as a Scientific Data and Image Processing Platform. Jenkins User Conference Boston #jenkinsconf Jenkins as a Scientific Data and Image Processing Platform Ioannis K. Moutsatsos, Ph.D., M.SE. Novartis Institutes for Biomedical Research www.novartis.com June 18, 2014 #jenkinsconf Life Sciences are

More information

NIH Commons Overview, Framework & Pilots - Version 1. The NIH Commons

NIH Commons Overview, Framework & Pilots - Version 1. The NIH Commons The NIH Commons Summary The Commons is a shared virtual space where scientists can work with the digital objects of biomedical research, i.e. it is a system that will allow investigators to find, manage,

More information

Deliverable D1.1. Building data bridges between biological and medical infrastructures in Europe. Grant agreement no.: 284209

Deliverable D1.1. Building data bridges between biological and medical infrastructures in Europe. Grant agreement no.: 284209 Deliverable D1.1 Project Title: Building data bridges between biological and medical infrastructures in Europe Project Acronym: BioMedBridges Grant agreement no.: 284209 Research Infrastructures, FP7 Capacities

More information

Check Your Data Freedom: A Taxonomy to Assess Life Science Database Openness

Check Your Data Freedom: A Taxonomy to Assess Life Science Database Openness Check Your Data Freedom: A Taxonomy to Assess Life Science Database Openness Melanie Dulong de Rosnay Fellow, Science Commons and Berkman Center for Internet & Society at Harvard University This article

More information

Enabling Collaboration Using the Biomedical Informatics Research Network (BIRN):

Enabling Collaboration Using the Biomedical Informatics Research Network (BIRN): Enabling Collaboration Using the Biomedical Informatics Research Network (BIRN): Karl Helmer Ph.D. Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital June 4, 2010 BIRN

More information

Exploitation of ISS scientific data

Exploitation of ISS scientific data Cooperative ISS Research data Conservation and Exploitation Exploitation of ISS scientific data Luigi Carotenuto Telespazio s.p.a. Copernicus Big Data Workshop March 13-14 2014 European Commission Brussels

More information

A grant number provides unique identification for the grant.

A grant number provides unique identification for the grant. Data Management Plan template Name of student/researcher(s) Name of group/project Description of your research Briefly summarise the type of your research to help others understand the purposes for which

More information

The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project

The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project Alastair Duncan STFC Pre Coffee talk STFC July 2014 SCAPE Scalable Preservation Environments The

More information

Research IT Application Development

Research IT Application Development Research IT Clinical IS OR Services Offered to Research Faculty & Staff: - Application development - Data storage servers - Epic Data extraction (I2B2) - Licensed Software for Researchers - General technology

More information

Open & Big Data for Life Imaging Technical aspects : existing solutions, main difficulties. Pierre Mouillard MD

Open & Big Data for Life Imaging Technical aspects : existing solutions, main difficulties. Pierre Mouillard MD Open & Big Data for Life Imaging Technical aspects : existing solutions, main difficulties Pierre Mouillard MD What is Big Data? lots of data more than you can process using common database software and

More information

Managing and Conducting Biomedical Research on the Cloud Prasad Patil

Managing and Conducting Biomedical Research on the Cloud Prasad Patil Managing and Conducting Biomedical Research on the Cloud Prasad Patil Laboratory for Personalized Medicine Center for Biomedical Informatics Harvard Medical School SaaS & PaaS gmail google docs app engine

More information

UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure

UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure Authors: A O Jaunsen, G S Dahiya, H A Eide, E Midttun Date: Dec 15, 2015 Summary Uninett Sigma2 provides High

More information

Integrated Rule-based Data Management System for Genome Sequencing Data

Integrated Rule-based Data Management System for Genome Sequencing Data Integrated Rule-based Data Management System for Genome Sequencing Data A Research Data Management (RDM) Green Shoots Pilots Project Report by Michael Mueller, Simon Burbidge, Steven Lawlor and Jorge Ferrer

More information

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform:

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform: Creating an Integrated, Optimized, and Secure Enterprise Data Platform: IBM PureData System for Transactions with SafeNet s ProtectDB and DataSecure Table of contents 1. Data, Data, Everywhere... 3 2.

More information

Neelesh Kamkolkar, Product Manager. A Guide to Scaling Tableau Server for Self-Service Analytics

Neelesh Kamkolkar, Product Manager. A Guide to Scaling Tableau Server for Self-Service Analytics Neelesh Kamkolkar, Product Manager A Guide to Scaling Tableau Server for Self-Service Analytics 2 Many Tableau customers choose to deliver self-service analytics to their entire organization. They strategically

More information

DataNet Flexible Metadata Overlay over File Resources

DataNet Flexible Metadata Overlay over File Resources 1 DataNet Flexible Metadata Overlay over File Resources Daniel Harężlak 1, Marek Kasztelnik 1, Maciej Pawlik 1, Bartosz Wilk 1, Marian Bubak 1,2 1 ACC Cyfronet AGH, 2 AGH University of Science and Technology,

More information

Selecting the Right NAS File Server

Selecting the Right NAS File Server Selecting the Right NAS File Server As the network administrator for a workgroup LAN, consider this scenario: once again, one of your network file servers is running out of storage space. You send out

More information

High-Performance, Low-Cost Computational Chemistry: Servers in a Stick, Box, and Cloud. Nathan Vance Polik Group Hope College February 19, 2015

High-Performance, Low-Cost Computational Chemistry: Servers in a Stick, Box, and Cloud. Nathan Vance Polik Group Hope College February 19, 2015 High-Performance, Low-Cost Computational Chemistry: Servers in a Stick, Box, and Cloud Nathan Vance Polik Group Hope College February 19, 2015 Outline The use and history of computing in chemistry The

More information

Service Road Map for ANDS Core Infrastructure and Applications Programs

Version 1.0, public exposure draft, 31 March 2010. Document Target Audience: This is a high-level reference guide designed to communicate to ANDS external

IMARIS. 3D and 4D interactive analysis and visualization solutions for the life sciences.

IMARIS: A Brief History. For over 20 years Bitplane has offered enabling scientific software tools for the life science

IT Coordination Group and ECRIN Data Centers

Venizeleas D, Ohmann Ch. DRAFT version, April 16, 2007. Contents: 1 Introduction; 2 IT Platform; 3 Platform Organization; 3.1 IT

Cloud Based Distributed Databases: The Future Ahead

Arpita Mathur, Mridul Mathur, Pallavi Upadhyay. Abstract: Fault-tolerant systems are necessary for distributed databases for data centers or

Enforce AD RMS Policies for PDF documents in SharePoint Environments... 5. Enforce AD RMS Policies for PDF documents in Exchange Environments...

Contents: Introduction; Foxit PDF Security Suite Environments; Enforce AD RMS Policies for PDF documents in SharePoint Environments; Enforce AD RMS Policies for PDF documents in Exchange Environments ...

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Peter Sirota, GM Elastic MapReduce. Data-Driven Decision Making: Data is the new raw material for any business, on par with capital, people, and labor. What is Big Data? Terabytes of

PARALLELS CLOUD STORAGE

Performance Benchmark Results. Table of Contents: Executive Summary; Architecture Overview; Key Features; No Special Hardware Requirements ...

Enabling Science in the Cloud: A Remote Sensing Data Processing Service for Environmental Science Analysis

Catharine van Ingen (1), Jie Li (2), Youngryel Ryu (3), Marty Humphrey (2), Deb Agarwal (4), Keith Jackson

European Molecular Biology Laboratory Case Example

Dr. Silke Schumacher, Director International Relations. EMBL Member States: Austria 1974, Denmark 1974, France 1974, Germany 1974, Israel 1974, Italy 1974, Netherlands

Three data delivery cases for EMBL-EBI's Embassy. Guy Cochrane www.ebi.ac.uk

EMBL European Bioinformatics Institute. Genes, genomes & variation: European Nucleotide Archive, 1000 Genomes, Ensembl, Ensembl Genomes

The big data revolution

Friso van Vollenhoven (Xebia). Enterprise NoSQL: Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing

Clinical Research Infrastructure

Enhancing UK's Clinical Research Capabilities & Technologies. At least 150m to establish/develop cutting-edge technological infrastructure, UK-wide, to bring into practice

Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee

Technology in Pedagogy, No. 8, April 2012. Written by Kiruthika Ragupathi (kiruthika@nus.edu.sg). Computational thinking is an emerging

Work Package 13.5: Authors: Paul Flicek and Ilkka Lappalainen. 1. Introduction

Report summarising the technical feasibility of the European Genotype Archive to collect, store, and use genotype data stored in European biobanks in a manner that complies with all

Usage guidelines for the Advanced Light Microscopy Technology Platform (ALM) at the Max-Delbrück-Center, Berlin

The following guidelines apply to all internal and external users of the ALM. The purpose

Make the Most of Big Data to Drive Innovation Through Research

White Paper. Bob Burwell, NetApp. November 2012, WP-7172. Abstract: Monumental data growth is a fact of life in research universities. The ability

ParaVision 6. Innovation with Integrity. The Next Generation of MR Acquisition and Processing for Preclinical and Material Research.

Preclinical MRI: a new standard in preclinical imaging. ParaVision sets a

CPIx - IT ASSESSMENT FORM

Part 1 - General Information and Company Policies on Information Exchange. Part 1 of this questionnaire covers general policy issues on the exchange of electronic information within

ENABLING DATA TRANSFER MANAGEMENT AND SHARING IN THE ERA OF GENOMIC MEDICINE. October 2013

Introduction: As sequencing technologies continue to evolve and genomic data makes its way into clinical use and

Local Loading. The OCUL, Scholars Portal, and Publisher Relationship

Scholars Portal has successfully maintained relationships with publishers for over a decade and continues to attract new publishers that recognize both the competitive advantage of perpetual access through

DAME: Astrophysical DAta Mining & Exploration on GRID

M. Brescia, S. G. Djorgovski, G. Longo & DAME Working Group. Istituto Nazionale di Astrofisica, Astronomical Observatory of Capodimonte, Napoli. Department

International Workshop on Big Data Analytics for Advanced Databases (BIGDATA, 2016)

Call for Papers. Aim and Scope: There is an exponential growth in digital data with unprecedented new platforms derived

ebook Utilizing MapReduce to address Big Data Enterprise Needs Leveraging Big Data to shorten drug development cycles in Pharmaceutical industry.

www.persistent.com. From the Vantage Point

Towards the construction of an integrated Wheat Information System

Mario Caccamo (1), Hadi Quesneville (2). Report, June 2012. (1) The Genome Analysis Centre (TGAC), Norwich Research Park, Norwich, UK; (2) INRA,

Big Data and Cloud Computing for GHRSST

Jean-Francois Piollé (jfpiolle@ifremer.fr), Frédéric Paul, Olivier Archer. CERSAT / Institut Français de Recherche pour l'Exploitation de la Mer. Facing data deluge

Early Cloud Experiences with the Kepler Scientific Workflow System

Available online at www.sciencedirect.com. Procedia Computer Science 9 (2012) 1630-1634. International Conference on Computational Science, ICCS 2012.

The Extension of the DICOM Standard to Incorporate Omics

Imperial College London. The Extension of the DICOM Standard to Incorporate Omics Data. Richard I Kitney, Vincent Rouilly and Chueh-Loo Poh, Department of Bioengineering. We stand at the dawn of a new understanding

Clinical Research Infrastructure at the European level: the ECRIN model. Christine Kubiak ECRIN Coordination Inserm- DRCT

Finland: FinnCRIN; Ireland: ICRIN; UK: UKCRN; France: Inserm; Denmark: DCRIN; Germany: KKS; Switzerland

MediSapiens Ltd. Bio-IT solutions for improving cancer patient care. Because data is not knowledge. 19th of March 2015

Sami Kilpinen, Ph.D, Co-founder, CEO, MediSapiens Ltd. Copyright 2015 MediSapiens Ltd. All

Global Networking of Collections WFCC and GBRCN perspectives. EMbaRC Seminar David Smith Cantacuzino Institute, Bucharest, Romania 8-9 March 2010

Summary: Challenges need collaboration; Networks; The WFCC

Human Brain Project -

Scientific goals, Organization, Our role. Wissenswerte, Bremen, 26 Nov 2013. Prof. Sonja Grün, Institute of Neuroscience and Medicine (INM-6) & Institute for Advanced Simulations (IAS-6)

Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility

Report of a meeting organized by the Wellcome Trust and held on 14-15 January 2003 at Fort Lauderdale,

Powering Cutting Edge Research in Life Sciences with High Performance Computing

A Point of View. High performance computing (HPC) is the foundation of pioneering research in life sciences. HPC plays a vital

The Key Elements of Digital Asset Management

The last decade has seen an enormous growth in the amount of digital content, stored on both public and private computer systems. This content ranges from professionally

June 2012. Advanced Data Reproduction and Storage System Data Center in a Box

Contents: Introduction; Steps for Successful Document Management; Phase 0: Survey; Phase 1: Architecture; Phase

M2M. Machine-to-Machine Intelligence Corporation. M2M Intelligence. Architecture Overview

M2M Intelligence: Essential platform for the M2M and IoT Economy. Architecture Overview. Revised styles and edits 6/3/2016

explore the cell... without limits

Thermo Scientific ArrayScan XTI HCA Infinity Configuration. Innovative confocal imaging. Live cell and label-free capability. Modular, flexible design. Industry-leading analysis

Steven Newhouse, Head of Technical Services

Challenges at EMBL-EBI. European Bioinformatics Institute: Outstation of the European Molecular Biology Laboratory. International organisation created by treaty

A Service for Data-Intensive Computations on Virtual Clusters

Executing Preservation Strategies at Scale. Rainer Schmidt, Christian Sadilek, and Ross King (rainer.schmidt@arcs.ac.at). Planets Project. Permanent

ELIXIR Scientific Programme 2014-2018

Contents: About ELIXIR; Executive Summary; Europe's Bioinformatics Infrastructure: key challenges 2014-2018; ELIXIR's Strategic

SINTERO SERVER. Simplifying interoperability for distributed collaborative health care

Tim Benson, Ed Conley, Andrew Harrison, Ian Taylor. COMSCI, Cardiff University. What is Sintero? Sintero Server is a

UNIVERSITY OF WROCŁAW

BIO-IMAGing in research INnovation and Education. Faculty of Biotechnology, Tamka 2, Przybyszewskiego 63/77, WROCŁAW, POLAND. Legend: I. Confocal system (LSM510 META - Zeiss) - Laboratory

HETEROGENEOUS DATA INTEGRATION FOR CLINICAL DECISION SUPPORT SYSTEM. Aniket Bochare - aniketb1@umbc.edu. CMSC 601 - Presentation

Date: 04/25/2011. Agenda: Introduction and Background; Framework; Heterogeneous

IDENTIFYING AND OPTIMIZING DATA DUPLICATION BY EFFICIENT MEMORY ALLOCATION IN REPOSITORY BY SINGLE INSTANCE STORAGE

(1) M. PRADEEP RAJA, (2) R.C SANTHOSH KUMAR, (3) P. KIRUTHIGA, (4) V. LOGESHWARI. (1, 2, 3) Student,

Big answers from big data: Thomson Reuters research analytics

REUTERS/Stoyan Nenov. Nordic Workshop on Bibliometrics and Research Policy. Ann Beynon, September 2014. Thomson Reuters: Solutions Portfolio to

NATIONAL INSTITUTE OF ENVIRONMENTAL HEALTH SCIENCES Division of Extramural Research and Training

NATIONAL ADVISORY ENVIRONMENTAL HEALTH SCIENCES COUNCIL, May 12-13, 2010. Concept Clearance: Small Business

A Strategy for Plant Breeding Data Management in International Agricultural Research

Introduction: Exchange of germplasm boosted crop improvement for subsistence agriculture during the 70s and 80s, and

RevoScaleR Speed and Scalability

EXECUTIVE WHITE PAPER. By Lee Edlefsen, Ph.D., Chief Scientist, Revolution Analytics. Abstract: RevoScaleR, the Big Data predictive analytics library included with Revolution

Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing.

Dr Liz Lyon, UKOLN, University of Bath. Introduction and Objectives: UKOLN is undertaking

Deploying Exchange Server 2007 SP1 on Windows Server 2008

Product Group - Enterprise. Dell White Paper by Ananda Sankaran and Andrew Bachler, April 2008. Contents: Introduction; Deployment Considerations ...

Turnkey Deduplication Solution for the Enterprise

Symantec NetBackup 5000 Appliance. Mayur Dewaikar, Sr. Product Manager, Information Management Group. White Paper: A Deduplication Appliance Solution for

Linked Science as a producer and consumer of big data in the Earth Sciences

Line C. Pouchard*, Robert B. Cook*, Jim Green*, Natasha Noy**, Giri Palanisamy*. *Oak Ridge National Laboratory, **Stanford Center

Hybrid Development and Test USE CASE

CliQr Use Case: Hybrid Development and Test. Unlike the production phase, with its typically steady workload, development and test