A demonstration of the use of Datagrid testbed and services for the biomedical community

1 A demonstration of the use of the DataGrid testbed and services for the biomedical community
Biomedical applications work package
V. Breton, Y. Legré (CNRS/IN2P3), R. Météry (CS)
Credits: C. Blanchet, T. Contamine, S. Gadras, M. Joubert, A. Minne, J. Montagnat

2 The Visual DataGrid Blast
- A graphical interface to enter query sequences and select the reference database
- A script to execute the algorithm on the grid
- A graphical interface to analyze results

3 When and where do biologists use it?
When?
- As the first step in analysing new sequences: comparing DNA or protein sequences with others stored in personal or public databases
Where?
- In a laboratory with an up-to-date version of the genomics and post-genomics data banks
  - Requires equipment to store databases and run algorithms
  - Requires manpower for system and network maintenance and for frequent database updates
- Most biologists use integrated web portals for their comparative genomics analysis: no need to worry about biological file formats or method arguments

4 Web portals for biologists under growing pressure
- Biologist enters sequences through a web interface
- Pipelined execution of bio-informatics algorithms
  - Comparative genomics analysis
  - Phylogenetics
  - 2D and 3D molecular structure of proteins
- The algorithms are executed on a local cluster
  - Big labs have big clusters
  - But the pressure is growing
- More and more biologists compare larger and larger sequences (whole genomes) to more and more genomes with fancier and fancier algorithms!

5 Executing on the grid
[Diagram labels: UI, JDL, Input Sandbox (input sequences), Output Sandbox (result), Job Submit Event, Job Status, Resource Broker, Replica Catalog, Information Service, Job Submission Service, Logging & Bookkeeping, Computing Element, Storage Element]
Credit: Fabio Hernandez
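The slide names a JDL description and the input and output sandboxes but does not show the job description itself. As a rough sketch only, the Python below builds the text of a JDL file for one sequence-comparison job. The attribute names (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are the ones commonly documented for EDG-style JDL and should be checked against the middleware release in use. The script name `run_blast.sh` and the file names are hypothetical and do not come from the demonstration.

```python
def make_blast_jdl(executable, query_file, output_file):
    """Return JDL text for one job. All file names passed in are hypothetical."""

    def quoted_list(items):
        # JDL lists are written as {"a", "b", ...}
        return "{" + ", ".join(f'"{item}"' for item in items) + "}"

    attributes = [
        ("Executable", f'"{executable}"'),
        ("Arguments", f'"{query_file} {output_file}"'),
        ("StdOutput", '"std.out"'),
        ("StdError", '"std.err"'),
        # Files shipped from the UI to the worker node with the job.
        ("InputSandbox", quoted_list([executable, query_file])),
        # Files brought back to the UI once the job has finished.
        ("OutputSandbox", quoted_list([output_file, "std.out", "std.err"])),
    ]
    return "\n".join(f"{name} = {value};" for name, value in attributes) + "\n"


jdl_text = make_blast_jdl("run_blast.sh", "query.fasta", "result.txt")
print(jdl_text)
```

Here the input sandbox carries the query sequences and the wrapper script, and the output sandbox returns the result file, matching the "Input Sandbox: input sequences" and "Output Sandbox: result" labels on the slide.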

6 Actual demonstration
[Figure: an input file of query sequences (Seq1, Seq2, ..., Seqn) on the UI, three Computing Elements, and a single RESULT output]
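One plausible reading of this figure is that the query sequences in the input file are shared out among several Computing Elements and the per-job outputs are then gathered into a single result. The sketch below illustrates that idea in Python under that assumption. The round-robin policy, the function names, and the toy sequences are illustrative choices, not the method used in the demonstration.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) records from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_round_robin(records, n_jobs):
    """Deal records into at most n_jobs non-empty batches, one per grid job."""
    batches = [[] for _ in range(n_jobs)]
    for index, record in enumerate(records):
        batches[index % n_jobs].append(record)
    return [batch for batch in batches if batch]


def merge_results(per_job_results):
    """Concatenate per-job result texts into one result, in job order."""
    return "\n".join(per_job_results)


example = ">Seq1\nMKT\nAYI\n>Seq2\nGGS\n>Seq3\nPLV\n"
batches = split_round_robin(parse_fasta(example), n_jobs=2)
merged = merge_results(["hits for job 0", "hits for job 1"])
```

Each batch would then be written to its own query file and submitted as a separate job, so the number of batches sets how many Computing Elements can work at once.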

7 The grid impact on computing
Swissprot vs Swissprot ( sequences)
- Running time on one CPU: 228 hours
- Tests at Institut de Biologie et Chimie des Protéines (quad-processor): ~49 hours
- Tests on DataGrid (cc-in2p3): 3 hours
Impacts:
- Reduced pressure on local computing
- Ability to handle very large jobs
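The three running times can be turned into speedup factors with a short calculation. The Python below uses only the figures quoted on the slide. The 49-hour value is approximate in the source, so its ratio is approximate too. The slide does not say how many processors the grid run used, so no per-processor efficiency is derived.

```python
single_cpu_hours = 228

# Running times quoted on the slide, in hours; the first one is approximate.
timings = {
    "IBCP local test": 49,
    "DataGrid (cc-in2p3)": 3,
}

# Speedup = single-CPU running time divided by the observed running time.
speedups = {name: single_cpu_hours / hours for name, hours in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: about {factor:.1f}x faster than one CPU")
```

On these figures the grid run is 76 times faster than a single CPU, and the local test is roughly 4.7 times faster.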

8 The grid impact on data handling
- DataGrid will allow mirroring of databases
  - An alternative to the current costly replication mechanism
  - Allowing web portals on the grid to access updated databases
[Diagram labels: Trembl (EBI), Swissprot (Geneva), Biomedical Replica Catalog]

9 This demo illustrates how grids can bring a revolution to genomics
- Grids expand the performance of genomics web portals
  - Distributed execution of bio-informatics algorithms, even the ones requiring huge amounts of CPU
  - Maintenance of up-to-date biological databases over the network
- Grids open new perspectives in large-scale genomics analysis
  - Complete genome annotation
  - Cross-genome analysis
  - Data mining on distributed databases
  - Pipelining of huge automatic bio-informatics analyses


Master's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University

Master's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University Master's projects at ITMO University Daniil Chivilikhin PhD Student @ ITMO University General information Guidance from our lab's researchers Publishable results 2 Research areas Research at ITMO Evolutionary

More information

Importance of Statistics in creating high dimensional data

Importance of Statistics in creating high dimensional data Importance of Statistics in creating high dimensional data Hemant K. Tiwari, PhD Section on Statistical Genetics Department of Biostatistics University of Alabama at Birmingham History of Genomic Data

More information

GRADUATE CATALOG LISTING

GRADUATE CATALOG LISTING GRADUATE CATALOG LISTING 1 BIOINFORMATICS & COMPUTATIONAL BIOLOGY Telephone: (302) 831-0161 http://bioinformatics.udel.edu/education Faculty Listing: http://bioinformatics.udel.edu/education/faculty A.

More information

IEEE International Conference on Computing, Analytics and Security Trends CAST-2016 (19 21 December, 2016) Call for Paper

IEEE International Conference on Computing, Analytics and Security Trends CAST-2016 (19 21 December, 2016) Call for Paper IEEE International Conference on Computing, Analytics and Security Trends CAST-2016 (19 21 December, 2016) Call for Paper CAST-2015 provides an opportunity for researchers, academicians, scientists and

More information

Globus Research Data Management: Introduction and Service Overview. Steve Tuecke Vas Vasiliadis

Globus Research Data Management: Introduction and Service Overview. Steve Tuecke Vas Vasiliadis Globus Research Data Management: Introduction and Service Overview Steve Tuecke Vas Vasiliadis Presentations and other useful information available at globus.org/events/xsede15/tutorial 2 Thank you to

More information

Big Data Mining Services and Knowledge Discovery Applications on Clouds

Big Data Mining Services and Knowledge Discovery Applications on Clouds Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades

More information

Software, Computing and Analysis Models at CDF and D0

Software, Computing and Analysis Models at CDF and D0 Software, Computing and Analysis Models at CDF and D0 Donatella Lucchesi CDF experiment INFN-Padova Outline Introduction CDF and D0 Computing Model GRID Migration Summary III Workshop Italiano sulla fisica

More information

Amazon Elastic MapReduce. Jinesh Varia Peter Sirota Richard Cole

Amazon Elastic MapReduce. Jinesh Varia Peter Sirota Richard Cole Amazon Elastic MapReduce Jinesh Varia Peter Sirota Richard Cole Start End From IDE Command line Web Console Notify Input Data Get Results Start End From IDE Command line Web Console AWS EC2 Instance Notify

More information

MIGRATING DESKTOP AND ROAMING ACCESS. Migrating Desktop and Roaming Access Whitepaper

MIGRATING DESKTOP AND ROAMING ACCESS. Migrating Desktop and Roaming Access Whitepaper Migrating Desktop and Roaming Access Whitepaper Poznan Supercomputing and Networking Center Noskowskiego 12/14 61-704 Poznan, POLAND 2004, April white-paper-md-ras.doc 1/11 1 Product overview In this whitepaper

More information

EMBL-EBI Web Services

EMBL-EBI Web Services EMBL-EBI Web Services Rodrigo Lopez Head of the External Services Team SME Workshop Piemonte 2011 EBI is an Outstation of the European Molecular Biology Laboratory. Summary Introduction The JDispatcher

More information

Big Data Visualization for Genomics. Luca Vezzadini Kairos3D

Big Data Visualization for Genomics. Luca Vezzadini Kairos3D Big Data Visualization for Genomics Luca Vezzadini Kairos3D Why GenomeCruzer? The amount of data for DNA sequencing is growing Modern hardware produces billions of values per sample Scientists need to

More information

Bio-Informatics Lectures. A Short Introduction

Bio-Informatics Lectures. A Short Introduction Bio-Informatics Lectures A Short Introduction The History of Bioinformatics Sanger Sequencing PCR in presence of fluorescent, chain-terminating dideoxynucleotides Massively Parallel Sequencing Massively

More information

Abdullah Mohammed Abdullah Khamis

Abdullah Mohammed Abdullah Khamis Abdullah Mohammed Abdullah Khamis Jeddah, Saudi Arabia Email: Abdullahkhamis@gmail.com Mobile: +966 567243182 Tel: +966 2 6340699 (Yemeni) Research and Professional Objective To Complete my Ph.D. in Pattern

More information

A Platform for Collaborative e-science Applications. Marian Bubak ICS / Cyfronet AGH Krakow, PL bubak@agh.edu.pl

A Platform for Collaborative e-science Applications. Marian Bubak ICS / Cyfronet AGH Krakow, PL bubak@agh.edu.pl A Platform for Collaborative e-science Applications Marian Bubak ICS / Cyfronet AGH Krakow, PL bubak@agh.edu.pl Outline Motivation Idea of an experiment Virtual laboratory Examples of experiments Summary

More information

Development of Bio-Cloud Service for Genomic Analysis Based on Virtual

Development of Bio-Cloud Service for Genomic Analysis Based on Virtual Development of Bio-Cloud Service for Genomic Analysis Based on Virtual Infrastructure 1 Jung-Ho Um, 2 Sang Bae Park, 3 Hoon Choi, 4 Hanmin Jung 1, First Author Korea Institute of Science and Technology

More information

Local Alignment Tool Based on Hadoop Framework and GPU Architecture

Local Alignment Tool Based on Hadoop Framework and GPU Architecture Local Alignment Tool Based on Hadoop Framework and GPU Architecture Che-Lun Hung * Department of Computer Science and Communication Engineering Providence University Taichung, Taiwan clhung@pu.edu.tw *

More information