Bioinformatics and e-Science

W.A. Gray (1) and C. Thompson (2)
(1) Cardiff University, School of Computer Science, PO Box 916, Cardiff CF24 3XF
(2) BBSRC, Polaris House, North Star Avenue, Swindon SN2 1UH

Abstract

This paper gives a brief overview of the diverse field of bioinformatics and identifies its research themes. It then introduces the BBSRC-funded e-Science pilot projects against this overview by presenting their biological aims. The areas of the e-Science programme addressed by these projects are identified to determine the contribution they are expected to make to e-Science. This covers their contributions to the developing Grid middleware and associated e-Science standards as well as their bioinformatics goals.

Acknowledgement

The authors thank the staff and PIs of the six projects who contributed material on which this paper is based. These are:

1) (e-Protein) A distributed pipeline for structural-based proteome annotation using GRID technology. Prof MJE Sternberg (PI). Imperial College, University College London, European Bioinformatics Institute.
2) (BioSimGRID) A GRID database for biomolecular simulations. Prof MSP Sansom (PI). Oxford University, Southampton University, Birkbeck College, York University, Nottingham University, Birmingham University.
3) (e-HTPX) An e-Science resource for high throughput protein crystallography. Dr C Nave (PI). CLRC Daresbury Laboratory, Cambridge University, Cardiff University, European Bioinformatics Institute, York University, Oxford University.
4) (BDWorld) A problem solving environment for global biodiversity: prototype and demonstrator. Prof FA Bisby (PI). Reading University, Cardiff University, Southampton University, Natural History Museum.
5) GRID-enabled modelling tools and databases for neuroinformatics. Dr N Goddard (PI). Edinburgh University (jointly funded with MRC).
6) (BASIS) Biology of ageing e-Science integration and simulation system. Prof TBL Kirkwood (PI). Newcastle University.

1. Introduction

In May the BBSRC held a meeting of its e-Science pilot projects. It was agreed at the meeting that a paper should be presented at this All Hands Meeting covering the six pilot projects, presenting their bioinformatics goals and how they will contribute to the aims of the UK e-Science programme. The intention is to show the types of bioinformatics research enabled by an e-Science approach and how these projects will drive and contribute to the e-Science developments occurring in parallel in other research disciplines across the UK e-Science programme.

Bioinformatics can be described as the derivation of knowledge from computer analysis of biological data. This is a simplistic view, as the discipline includes a wide range of scientific investigation and is not a homogeneous domain. Consequently there is a wide range of opinions and views as to what it comprises. In its broadest sense it is the application of informatics techniques to biological data in a research or application environment. These informatics techniques can come from a number of disciplines including computer science, statistics and mathematics. This list is by no means exhaustive, as researchers also utilise techniques from engineering, physics and other scientific disciplines.

In their Strategic Plan [1] the BBSRC recognise the growing importance of bioinformatics in their domain when they state: "Genome sequencing and post-genomic technologies provide researchers with massive amounts of data. As a consequence biology is becoming more quantitative. Large experimental data sets will increasingly allow computer (in silico) simulation of biological systems." This points to the central role of bioinformatics in the next generation of bioscience research.

The BBSRC identify in [1] the following areas as important to their research agenda in non-clinical bioscience: Integrative biology, Sustainable agriculture, The Healthy organism, and Bioscience for industry. It is recognised that there are different levels of biological structure within these areas, from individual molecules, cells and tissues/organs through populations to microbes, plants and animals. Research is needed in a number of ways both within these levels and across them. This research underpins the growth in bioinformatics because it generates large amounts of data, which need to be analysed, integrated and used in simulations to test new ideas. The second strategic objective in [1] recognises that bioscience is increasingly dependent on the development and use of e-tools because of the data being collected at the omics level of research, which should lead to more productive in silico research and the development of new bioinformatics tools.
These tools will be needed in areas such as data mining, pattern recognition, model building and data sharing across the vertical and horizontal biological structure levels.

Traditional bioscience research also needs access to the data collections made over the centuries by institutes such as the Natural History Museum. These collections must be prepared in machine-readable formats that allow investigations shedding new light on biodiversity and on the effects on it of changes in climate, agricultural policy and government policy. It is important that UK research in bioinformatics links with other efforts at national and international levels, e.g. the GBIF (Global Biodiversity Information Facility) initiative in biodiversity.

2. The Pilot Projects

When the pilot project proposals were submitted, the BBSRC had identified four theme areas for these projects: Genomics, Structural studies, Cellular processes, and Biodiversity. The successful projects covered topics within these areas, although there was no specific pilot project in the genomics area.

The BASIS project at Newcastle University [2] is concerned with utilising emerging Grid middleware technology to develop a system supporting a research community that is investigating quantitatively the biology of ageing at the cell, tissue and organism levels. It is creating new tools, using standards such as SBML (Systems Biology Markup Language), which will help a researcher build computer-based models of ageing processes and test them. At the tissue level these will be based on fibroblasts, gut and brain, with experiments being conducted to determine the effect of random cell death in a tissue. Users will be able to set up experiments involving the creation of new models or the modification of existing models, which let them gain a deeper understanding of the effects of tissue ageing on organisms.
It is intended that this facility will be available to a distributed community of researchers who will share their results, models and analytic tools. Thus this project aims to create a facility which allows simulation by system modelling within a structure level and across levels. The development team is working closely with another local team who are developing Grid middleware in the OGSA-DAI project for accessing data held in databases. Thus their requirements are informing the development of this middleware and they are utilising it in the development of the system.

e-Protein aims to provide a structure-based annotation of proteins in major genomes, which can be disseminated to other researchers. Alternative annotation approaches will be investigated to identify improvements in the methods of annotation. The results will be used to build local databases holding structural and functional annotation of sequence data, which can be linked with relevant bioinformatics data resources at other sites. The improvement of protein modelling is a prime aim of the project so that better function predictions can be made. The system is intended to be available to a collaborating research community so that its members can investigate and test alternative structure models. It will have a workflow-based interface which makes use of the ICENI middleware and its rich metadata structure for describing software tools. This middleware is being developed in a related Grid project which involves several of the e-Protein investigators. As in BASIS, this project will be informing the development of the Grid middleware it requires.

e-HTPX is addressing the problem of unifying the procedures of protein structure determination so that they can be accessed through a single interface which allows structural biologists to create models from the data generated by high throughput protein crystallography. This will involve creating new structure determination software which can take advantage of HPC facilities so that the results of the structure determination can be delivered on the same time scale as data collection. Data generated in these experiments will be stored at the EBI as an available resource for the research community. The project will involve giving users access to instruments, data collections and analytic tools.
This project is primarily concerned with building structural models within a structure level. As the system is expected to have a number of industrial users, an important concern is the authentication of users to protect the system against unauthorised use. The development team will be consulting potential industrial users to determine their requirements in this important area, and will then establish whether these requirements can be supported by Grid facilities.

BioSimGrid aims to allow comparisons to be made of the results of multiple biomolecular simulations so that the structure of proteins and nucleic acids can be better understood. Its users will have access to large quantities of simulation data, which will be integrated in further simulation experiments that test theories in structural biology within structure levels, with the capability to reuse this data in cross-level modelling of structures. This data will require curation. These secondary analysis experiments will need data mining services to locate relevant data within these databases, as well as data analysis tools. Some of the research community using this system will be working in commercial organisations in the pharmaceutical industry. This introduces the need for user authentication before access is allowed to some of the data and tools, as commercial confidentiality will need to be protected. It also means the project has an interest in investigating mechanisms for distributed authorisation and accounting. There is also a requirement to link the simulation data with other biological and structural data held in national repositories to allow the development of richer, more sophisticated models.

BDWorld [3] is creating a problem solving environment in which researchers can locate appropriate analytic tools and data resources held at different sites in the environment. These tools and data can then be linked in a workflow which produces results relevant to an investigation into a biodiversity problem at the species level.
Such a problem might be: what will be the effect on a species' distribution if global warming occurs, or could this plant become invasive if it is introduced as an agricultural crop in a region? The system utilises a partial catalogue of life and other biotic and abiotic data, such as climate envelopes, in three exemplar studies to prove the concept: biodiversity richness analysis; bioclimatic modelling and climate change; and phylogenetic analysis and biogeography. This will involve the system linking heterogeneous legacy and current data collections so that it can interoperate on this data using a variety of software tools with different data format expectations. It is intended that this work will link with the GBIF system being developed in an international effort, as its resources will complement GBIF's. This system must be able to evolve by adding new data collections and software tools to its distributed resources. This requires wrapping of legacy resources as they join so that they are consistent with the standards for data within BDWorld. This system is primarily concerned with the microbe, plant and animal level of biology, although it will be able to support some lower-level analyses. At the moment a basic BDWorld system is being built, but its design is such that it will be able to evolve: by incorporating new tools and data resources; by adding ontologies which will help users discover the resources they need; and by incorporating more sophisticated display tools. The designers are aware that Grid middleware is being created in parallel with their development of BDWorld, and they are ensuring that BDWorld will be able to take advantage of appropriate Grid middleware when it has reached a suitable stage in its development.

The neuroinformatics project [4] is investigating how the brain functions. It is intended that the developed system will allow neuroscientists to work collaboratively, sharing their data and software tools. Research in this area needs to be undertaken at different biological levels and across the levels. The challenge is to allow the researchers to create their models collaboratively and conduct experiments on them. This involves being able to locate appropriate data so that it can be linked in the models or utilised by the models. This data, and the data produced by the models, must be available for use in future experiments. This research is being undertaken in collaboration with scientists at the Newcastle e-Science centre, who are looking after the database and Grid middleware aspects in a separate project. The prime concern of this project at the moment is creating data models and software tools that enable heterogeneous data to be easily shared and analysed by its user community.
The design team recognise that the next phase of development will need to provide sophisticated visualisation tools and ontologies. Thus they are concentrating on creating a basic system environment that can evolve by adding such tools in the future.

3. Pilot Project e-Science themes

These pilot projects display a number of e-Science themes. They are all aiming to support collaborative working within a research community whose members need to share data, results and software tools. This means they will need to support discovery of relevant data sources, data and their descriptions so that the different tools can analyse and share data. When accessing legacy data they will need to overcome heterogeneity in data representation, which arises because the data was prepared over time for different purposes, and to develop new, extended standards for the metadata describing this data which allow its provenance to be established and stored. Many of the projects are creating results which have to be stored so that other analytic tools can use this data in the future. This implies that the data will need to be curated, with provenance showing how it was created and the tools that created it. There will also be a need to use and store descriptive data so that the representation of the data is understood by researchers and can be interpreted by software tools.

There is some need for High Performance Computing (HPC), but it is not a major requirement of the projects. It occurs when complex models are being built and the results of analysing and using the models are needed in real time for further analysis. e-HTPX is the only project seeing this as a prime requirement, although the others may need some access to these facilities in the future.

All of the projects will be creating new software tools which need to be made available within the system for other users. These tools must be engineered so that they can link with existing tools and utilise the data available in the Grid system.
This means that there must be descriptions of these tools which enable them to be linked in analytic chains that can be executed by workflow engines. Several of the projects need workflow engines for their user interface to allow a user to create and execute workflows which perform the required analysis.

Most of the projects are aiming to support the building of structural models at different biological levels, which can be used to determine the functioning of a biological system and the effect of change on the system. It is clear that as bioinformatics expands through the Grid this will become a growing area, as researchers create more and more complex models that are not limited to one biological level but interact across the levels to determine the effect of substructure change on the higher-level structures. This growth in model complexity will also be reflected in a growing diversity in the data sources used in the models as researchers investigate more fully the causes of change and evolution.

There is little emphasis at the moment on the need for sophisticated presentation tools which allow users to present information in different and more imaginative ways. This is probably because modelling is mainly done at a single level at the moment. Another reason could be that the structural modelling of biological systems is a relatively new technique; as it matures, modellers will require more sophisticated displays of the results from these complex models to make it easier for users to understand the outcomes. It is also a feature of the current state of development of the systems, where these facilities are seen as a second stage of development and an unnecessary luxury until the basic systems are working.

Two projects are investigating the use of Grid authentication techniques. This reflects the fact that only these projects currently have industrial links, rather than authentication being of no concern elsewhere in the area. Again, as the field matures and this type of analysis becomes more accepted, there will be an increase in this requirement.

4. Expected effect on e-Science

It is clear that the pilot projects have fairly ambitious bioinformatics goals and that they do not see themselves developing middleware per se for the Grid, but co-operating with the projects that are developing the middleware. The e-Protein and neuroinformatics projects are working closely with research groups that are developing middleware in separate projects and will utilise this software as it becomes available.
The e-Protein team are working closely with the team developing the ICENI middleware at Imperial College, and the neuroinformatics project has a close link with the team developing the OGSA-DAI middleware at Newcastle University. These projects will have a direct influence on the development of these pieces of e-Science middleware. The other projects are keeping themselves informed of middleware developments and will utilise appropriate middleware when it is in a stable enough form. Until then they are likely to use alternative non-generic, limited-capability software that is already available or that they have developed themselves to meet their needs. However, they all intend to take full advantage of Grid middleware when it is stable.

All the projects have major data handling challenges, and one of the major contributions from this research programme should be insight into the future metadata requirements and standards in Grid environments. This covers the description of data and software as well as the provenance and curation of data. A major issue through all the projects is the interoperation of data held in different data collections. Considerable insight should be gained from these pilots as to how to describe and hold data so that this task is facilitated, especially with respect to the wrapping of legacy systems so that they can enter new environments easily. Although it is not a major feature at the moment, these systems will need metadata repositories and ontologies to help users identify the resources they require within the environments. These will be needed to help the researchers create the workflows that will carry out their analyses of the data. At the moment some of the projects are investigating or creating embryonic workflow engines, which will be used to execute the analytic chains of software tools that users create to specify the required analysis. This work should further inform the development of these workflow engines.
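The idea of tool descriptions that allow tools to be linked into analytic chains and executed by a workflow engine can be illustrated with a minimal sketch. It is not taken from any of the pilot projects, nor from ICENI or OGSA-DAI: every name in it (ToolDescription, ProvenanceRecord, validate_chain, run_chain, the format labels and the two toy tools) is a hypothetical choice made purely for illustration. It assumes that each tool declares the data format it consumes and the format it produces, so that a chain can be checked before it is run and the tools used can be recorded as simple provenance.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass(frozen=True)
class ToolDescription:
    """Hypothetical metadata describing one analytic tool."""
    name: str
    input_format: str   # label of the data format the tool consumes
    output_format: str  # label of the data format the tool produces
    run: Callable[[Any], Any]


@dataclass
class ProvenanceRecord:
    """Hypothetical record of which tools produced a result, in order."""
    steps: List[str] = field(default_factory=list)


def validate_chain(tools: List[ToolDescription]) -> None:
    """Raise ValueError if two adjacent tools disagree on data format."""
    for upstream, downstream in zip(tools, tools[1:]):
        if upstream.output_format != downstream.input_format:
            raise ValueError(
                f"{upstream.name} emits {upstream.output_format!r} but "
                f"{downstream.name} expects {downstream.input_format!r}"
            )


def run_chain(tools: List[ToolDescription], data: Any) -> Tuple[Any, ProvenanceRecord]:
    """Check the chain, run each tool in turn and record the tools used."""
    validate_chain(tools)
    provenance = ProvenanceRecord()
    for tool in tools:
        data = tool.run(data)
        provenance.steps.append(tool.name)
    return data, provenance


# Two toy tools standing in for real analytic software (assumptions).
def parse_records(text: str) -> List[str]:
    """Split plain text into non-empty, stripped records."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def count_records(records: List[str]) -> int:
    """Count the records in a list."""
    return len(records)


PARSE = ToolDescription("parse_records", "text/plain", "record-list", parse_records)
COUNT = ToolDescription("count_records", "record-list", "integer", count_records)

if __name__ == "__main__":
    result, provenance = run_chain([PARSE, COUNT], "Quercus robur\nFagus sylvatica\n")
    print(result, provenance.steps)
```

Checking formats before execution means an ill-formed chain is rejected up front rather than failing part-way through an analysis. A real workflow engine would need far richer descriptions (ontologies, resource locations, access control) than this sketch attempts.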
These projects all intend to develop basic systems which can evolve to meet future, as yet unknown, requirements. This will be an important feature of their system architectures, and the development of these pilots should give us more insight into the best ways of building systems with this capability. This will be important in the development of sophisticated Grid systems, as we will not be able to afford to recreate such systems from scratch.

5. Conclusions

The bioinformatics pilot projects will make meaningful contributions to the e-Science programme in creating the metadata standards required for bioinformatics data and tools. They will contribute to the definition of data curation and provenance standards. They do not intend to make a direct contribution to the development of the middleware required for the Grid, but their use of it as it evolves will inform the development of this middleware. They will also identify new middleware requirements. It is clear that in the future this research will inform the development of the next generation of ontologies and data/resource discovery tools, and of more sophisticated presentation tools such as result visualisation. However, the major contribution of these pilots will be as catalysts which encourage more bioinformatics research by demonstrating what can be achieved by collaborative in silico data experimentation and analysis in bioscience. The pilots will also create the basic systems which will allow the next generation of researchers to fully exploit this capability. This will enable the field to grow and support the collaborative working needed to build and exploit the next generation of systems biology models.

References

1. World Class Bioscience, Strategic Plan, BBSRC, Swindon (2003)
2. Kirkwood TBL et al: Towards an e-biology of ageing: integrating theory and data, Nature Reviews Molecular Cell Biology 4 (2003)
3. Bisby FA: Biodiversity Informatics, in Business (quarterly magazine of the BBSRC), 24-25, July
4. Goddard N, Cannon R and Howell F: Axiope Tools for Data Management and Data Sharing, accepted by J Neuroinformatics, to be published (2003)
More informationEuro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences
Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.1 Coordination Deliverable 11.3 Selected Standards
More informationProgramme Specification (Undergraduate) Date amended: August 2012
Programme Specification (Undergraduate) Date amended: August 2012 1. Programme Title(s) and UCAS code(s): BSc Biological Sciences C100 BSc Biological Sciences (Biochemistry) C700 BSc Biological Sciences
More informationA CONTENT STANDARD IS NOT MET UNLESS APPLICABLE CHARACTERISTICS OF SCIENCE ARE ALSO ADDRESSED AT THE SAME TIME.
Biology Curriculum The Georgia Performance Standards are designed to provide students with the knowledge and skills for proficiency in science. The Project 2061 s Benchmarks for Science Literacy is used
More informationContents. Page 1 of 11
Programme-specific Section of the Curriculum for the MSc Programme in Bioinformatics at the Faculty of Science, University of Copenhagen 2009 (Rev. 2015) Contents 1 Title, affiliation and language... 2
More informationThe Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets
The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets!! Large data collections appear in many scientific domains like climate studies.!! Users and
More informationOpen Source Software in Life Science Research. Woodhead Publishing Series in Biomedicine
Brochure More information from http://www.researchandmarkets.com/reports/2719842/ Open Source Software in Life Science Research. Woodhead Publishing Series in Biomedicine Description: The free/open source
More informationBrain Segmentation A Case study of Biomedical Cloud Computing for Education and Research
Brain Segmentation A Case study of Biomedical Cloud Computing for Education and Research Victor Chang Leeds Metropolitan University (and University of Southampton) School of Computing and Creative Technologies,
More informationGlobal Scientific Data Infrastructures: The Big Data Challenges. Capri, 12 13 May, 2011
Global Scientific Data Infrastructures: The Big Data Challenges Capri, 12 13 May, 2011 Data-Intensive Science Science is, currently, facing from a hundred to a thousand-fold increase in volumes of data
More informationSoaplab - a unified Sesame door to analysis tools
Soaplab - a unified Sesame door to analysis tools Martin Senger, Peter Rice, Tom Oinn European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK http://industry.ebi.ac.uk/soaplab Abstract
More informationNetherlands escience Center
Netherlands escience Center ICT Synergy Hub, Amsterdam Research & Innovation in the Big Data Era CWI in Bedrijf Centrum Wiskunde & Informatica Op 5 oktober 2012 Prof. dr. Jacob de Vlieg ¹ ² 1. CEO & Scientific
More informationKeystones for supporting collaborative research using multiple data sets in the medical and bio-sciences
Keystones for supporting collaborative research using multiple data sets in the medical and bio-sciences David Fergusson Head of Scientific Computing The Francis Crick Institute The Francis Crick Institute
More informationPipeline Pilot Enterprise Server. Flexible Integration of Disparate Data and Applications. Capture and Deployment of Best Practices
overview Pipeline Pilot Enterprise Server Pipeline Pilot Enterprise Server (PPES) is a powerful client-server platform that streamlines the integration and analysis of the vast quantities of data flooding
More informationThe challenge of managing research data. Axel Berg
The challenge of managing research data Axel Berg Context The data deluge cannot be stopped Without adequate data management: - the ever-growing amounts and complexity of data will be non-controllable
More informationEnvironmental Research and Innovation ( ERIN )
RDI Department Environmental Research and Innovation ( ERIN ) LIST s Environmental Research & Innovation (ERIN) department develops strategies, technologies and tools to better monitor, assess, use and
More informationEfficient Data Storage and Analysis for Generic Biomolecular Simulation Data
Efficient Data Storage and Analysis for Generic Biomolecular Simulation Data Muan Hong Ng 1, Steven Johnston 1, Stuart Murdock 2, Bing Wu 3, Kaihsu Tai 4, Hans Fangohr 1, Simon Cox 1, Jonathan W. Essex
More informationDescribing Web Services for user-oriented retrieval
Describing Web Services for user-oriented retrieval Duncan Hull, Robert Stevens, and Phillip Lord School of Computer Science, University of Manchester, Oxford Road, Manchester, UK. M13 9PL Abstract. As
More informationCYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21)
CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21) Goal Develop and deploy comprehensive, integrated, sustainable, and secure cyberinfrastructure (CI) to accelerate research
More informationCloud Computing for e-science with CARMEN
Cloud Computing for e-science with CARMEN Paul Watson, Phillip Lord, Frank Gibson, Panayiotis Periorellis, Georgios Pitsilis School of Computing Science, Newcastle University, Newcastle-upon-Tyne, UK Paul.Watson@newcastle.ac.uk
More informationTHE BRITISH LIBRARY. Unlocking The Value. The British Library s Collection Metadata Strategy 2015-2018. Page 1 of 8
THE BRITISH LIBRARY Unlocking The Value The British Library s Collection Metadata Strategy 2015-2018 Page 1 of 8 Summary Our vision is that by 2020 the Library s collection metadata assets will be comprehensive,
More informationTranslational research facilitating experimental medicine in dementia in the UK
Translational research facilitating experimental medicine in dementia in the UK Simon Lovestone Director Biomedical Research Unit for dementia at Maudsley and King s Route Map for Dementia Research June
More informationHealthcare, transportation,
Smart IT Argus456 Dreamstime.com From Data to Decisions: A Value Chain for Big Data H. Gilbert Miller and Peter Mork, Noblis Healthcare, transportation, finance, energy and resource conservation, environmental
More informationUsing Ontologies in Proteus for Modeling Data Mining Analysis of Proteomics Experiments
Using Ontologies in Proteus for Modeling Data Mining Analysis of Proteomics Experiments Mario Cannataro, Pietro Hiram Guzzi, Tommaso Mazza, and Pierangelo Veltri University Magna Græcia of Catanzaro, 88100
More informationFunding New Innovations in Synthetic Biology
Funding New Innovations in Synthetic Biology Dr Belinda Clarke Lead Technologist, Synthetic Biology email: belinda.clarke@tsb.gov.uk twitter: @Belinda_Clarke Janus Transitions and change The past and the
More informationSupporting Collaborative Grid Application Development Within The E-Science Community p. 1
Supporting Collaborative Grid Application Development Within The E-Science Community Supporting Collaboration within the e-science Community Cornelia Boldyreff, David Nutter & Stephen Rank http://www.lincoln.ac.uk/faculties/computing/index.html
More informationSoftware Description Technology
Software applications using NCB Technology. Software Description Technology LEX Provide learning management system that is a central resource for online medical education content and computer-based learning
More informationAn Interdepartmental Ph.D. Program in Computational Biology and Bioinformatics:
An Interdepartmental Ph.D. Program in Computational Biology and Bioinformatics: The Yale Perspective Mark Gerstein, Ph.D. 1,2, Dov Greenbaum 1, Kei Cheung, Ph.D. 3,4,5, Perry L. Miller, M.D., Ph.D. 3,4,6
More informationOvercoming the Technical and Policy Constraints That Limit Large-Scale Data Integration
Overcoming the Technical and Policy Constraints That Limit Large-Scale Data Integration Revised Proposal from The National Academies Summary An NRC-appointed committee will plan and organize a cross-disciplinary
More informationHuman Brain Project -
Human Brain Project - Scientific goals, Organization, Our role Wissenswerte, Bremen 26. Nov 2013 Prof. Sonja Grün Insitute of Neuroscience and Medicine (INM-6) & Institute for Advanced Simulations (IAS-6)
More informationTHE CCLRC DATA PORTAL
THE CCLRC DATA PORTAL Glen Drinkwater, Shoaib Sufi CCLRC Daresbury Laboratory, Daresbury, Warrington, Cheshire, WA4 4AD, UK. E-mail: g.j.drinkwater@dl.ac.uk, s.a.sufi@dl.ac.uk Abstract: The project aims
More informationInformatics and Knowledge Management at the Novartis Institutes for BioMedical Research (NIBR)
Informatics and Knowledge Management at the Novartis Institutes for BioMedical Research (NIBR) Enable Science in silico & Provide the Right Knowledge to the Right People at the Right Time to enable the
More informationLife as a scientific database curator
Life as a scientific database curator Sandra Orchard EBI is an Outstation of the European Molecular Biology Laboratory. What is a database curator Curator OED - a keeper of a museum or other collection
More informationSchool of Biosciences: MRC Phenome Centre-Birmingham
University of Birmingham College of Life and Environmental Sciences School of Biosciences: MRC Phenome Centre-Birmingham Bioinformatics (Metabolomics-specific) Research Fellow 1 Salary from 28,695 to 37,394
More informationLinked Science as a producer and consumer of big data in the Earth Sciences
Linked Science as a producer and consumer of big data in the Earth Sciences Line C. Pouchard,* Robert B. Cook,* Jim Green,* Natasha Noy,** Giri Palanisamy* Oak Ridge National Laboratory* Stanford Center
More informationInformation and Communications Technology Strategy 2014-2017
Contents 1 Background ICT in Geoscience Australia... 2 1.1 Introduction... 2 1.2 Purpose... 2 1.3 Geoscience Australia and the Role of ICT... 2 1.4 Stakeholders... 4 2 Strategic drivers, vision and principles...
More informationClinical Research Infrastructure
Clinical Research Infrastructure Enhancing UK s Clinical Research Capabilities & Technologies At least 150m to establish /develop cutting-edge technological infrastructure, UK wide. to bring into practice
More informationBIOSCIENCES COURSE TITLE AWARD
COURSE TITLE AWARD BIOSCIENCES As a Biosciences undergraduate student at the University of Westminster, you will benefit from some of the best teaching and facilities available. Our courses combine lecture,
More informationScience for a healthy society. Food Safety & Security. Food Databanks. Food & Health. Industrial Biotechnology. Gut Health
Food Safety & Security Food Databanks Food & Health Industrial Biotechnology National Collection of Yeast Cultures Gut Health Science for a healthy society 1 invested in IFR delivers over 8 of benefits
More informationTHe evolution of analytical lab InForMaTICs
Informatics Strategies for the Convergent Analytical Lab TECHNOLOGY REVIEW In many labs today, the drive to replace paper has begun pitting two systems against each other. The functionality in LIMS, which
More informationParadigm Changes Affecting the Practice of Scientific Communication in the Life Sciences
Paradigm Changes Affecting the Practice of Scientific Communication in the Life Sciences Prof. Dr. Martin Hofmann-Apitius Head of the Department of Bioinformatics Fraunhofer Institute for Algorithms and
More informationIntro to Data Management. Chris Jordan Data Management and Collections Group Texas Advanced Computing Center
Intro to Data Management Chris Jordan Data Management and Collections Group Texas Advanced Computing Center Why Data Management? Digital research, above all, creates files Lots of files Without a plan,
More informationNIH Commons Overview, Framework & Pilots - Version 1. The NIH Commons
The NIH Commons Summary The Commons is a shared virtual space where scientists can work with the digital objects of biomedical research, i.e. it is a system that will allow investigators to find, manage,
More informationIn Vivo In Silico (ivis): the Virtual Worm, Weed and Bug
GC1 In Vivo In Silico (ivis): the Virtual Worm, Weed and Bug Ronan Sleep We routinely use massively powerful computer simulations and visualisations to design aeroplanes, build bridges and to predict weather.
More informationResearch and Innovation Strategy: delivering a flexible workforce receptive to research and innovation
Research and Innovation Strategy: delivering a flexible workforce receptive to research and innovation Contents List of Abbreviations 3 Executive Summary 4 Introduction 5 Aims of the Strategy 8 Objectives
More informationProgramme Specification (2014-15): MSc in Bioinformatics and Computational Genomics
Date of Revision Date of Previous Revision Programme Specification (2014-15): MSc in Bioinformatics and Computational Genomics A programme specification is required for any programme on which a student
More informationFamilies of Database Schemas for Neuroscience Experiments
Families of Database Schemas for Neuroscience Experiments Larissa Cristina Moraes Advisor: Kelly Rosa Braghetto Institute of Mathematics and Statistics - University of São Paulo May 6th, 2015 1 / 16 Agenda
More informationThe cross-disciplinary Roots of the British collaboration between scholars in humanities and
HALOGEN RESEARCH DATA MANAGEMENT BENEFITS CASE STUDY 1. BACKGROUND The cross-disciplinary Roots of the British collaboration between scholars in humanities and genetics at the University of Leicester (Wellcome
More informationPROGRAMME SPECIFICATION
PROGRAMME SPECIFICATION 1 Awarding Institution: University of Exeter 2 School(s)/Teaching Institution: School of Biosciences 3 Programme accredited/validated by: 4 Final Award(s): MSc Medical Informatics
More informationGRADUATE CATALOG LISTING
GRADUATE CATALOG LISTING 1 BIOINFORMATICS & COMPUTATIONAL BIOLOGY Telephone: (302) 831-0161 http://bioinformatics.udel.edu/education Faculty Listing: http://bioinformatics.udel.edu/education/faculty A.
More informationMEng, BSc Computer Science with Artificial Intelligence
School of Computing FACULTY OF ENGINEERING MEng, BSc Computer Science with Artificial Intelligence Year 1 COMP1212 Computer Processor Effective programming depends on understanding not only how to give
More informationCALIFORNIA STATE UNIVERSITY CHANNEL ISLANDS
CALIFORNIA STATE UNIVERSITY CHANNEL ISLANDS PROGRAM MODIFICATION DATE: 12.06.06 PROGRAM AREA: BIOLOGY AND BUSINESS AND ECONOMICS SEMESTER /YEAR FIRST EFFECTED: FALL 2007 Please use the following format
More informationTier 2 Canada Research Chair in Bioinformatics Additional Information for Potential Applicants
St. Francis Xavier University (StFX) Tier 2 Canada Research Chair in Bioinformatics Additional Information for Potential Applicants Founded in 1853, St. Francis Xavier University (StFX) has a long and
More informationReport of the DTL focus meeting on Life Science Data Repositories
Report of the DTL focus meeting on Life Science Data Repositories Goal The goal of the meeting was to inform and discuss research data repositories for life sciences. The big data era adds to the complexity
More informationService Road Map for ANDS Core Infrastructure and Applications Programs
Service Road Map for ANDS Core and Applications Programs Version 1.0 public exposure draft 31-March 2010 Document Target Audience This is a high level reference guide designed to communicate to ANDS external
More informationJust the Facts: A Basic Introduction to the Science Underlying NCBI Resources
1 of 8 11/7/2004 11:00 AM National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools
More informationEDITORIAL MINING FOR GOLD : CAPITALISING ON DATA TO TRANSFORM DRUG DEVELOPMENT. A Changing Industry. What Is Big Data?
EDITORIAL : VOL 14 ISSUE 1 BSLR 3 Much has been written about the potential of data mining big data to transform drug development, reduce uncertainty, facilitate more targeted drug discovery and make more
More informationBIOINF 525 Winter 2016 Foundations of Bioinformatics and Systems Biology http://tinyurl.com/bioinf525-w16
Course Director: Dr. Barry Grant (DCM&B, bjgrant@med.umich.edu) Description: This is a three module course covering (1) Foundations of Bioinformatics, (2) Statistics in Bioinformatics, and (3) Systems
More informationIT Challenges for the Library and Information Studies Sector
IT Challenges for the Library and Information Studies Sector This document is intended to facilitate and stimulate discussion at the e-science Scoping Study Expert Seminar for Library and Information Studies.
More informationGlobal Ecology and Wildlife Conservation
Vaughan Centre for Lifelong Learning Part-Time Certificate of Higher Education in Global Ecology and Wildlife Conservation Delivered via Distance Learning FAQs What are the aims of the course? This course
More informationStatement of ethical principles for biotechnology in Victoria
Statement of ethical principles for biotechnology in Victoria Statement of ethical principles for biotechnology in Victoria Acknowledgments Published by the Public Health Group, Rural & Regional Health
More information