Metagenomic Assembly
- Elinor Bruce
- 10 years ago
Transcription
1 Metagenomic Assembly: Successes, Validation, and Challenges
Matthew Scholz, Los Alamos National Laboratory, Genome Sciences Division
2 Metagenomic Assembly is Complex
- Complexity dictates method of assembly
- Simple communities assemble easily
- Complex communities do not
- Complexity dictates sequencing requirement
- More sequence means more complexity and more difficulty
3 Assembly Successes
- High throughput pipeline
- Improved assemblies
- JGI/LANL has successfully assembled 123 metagenomes in last year
- Average time ~1 week/sample
- HMP metagenome assemblies
- LANL assembled 223 metagenomes from whole genome shotgun sequencing of HMP
- Assembly of 10 site specific samples (multiple samples from same site)
- Validation of several metagenome assemblies
4 Assembly differences
- HMP shotgun metagenome assembly: Optimization; Tool Selection; Kmer Selection; Selection of # Cores; Volume production; 1 Kmer; Metrics
- JGI Metagenome Assembly: Multiple tool selection; Range of Kmers utilized; Many different sample types
5 Draft
- Many Samples
- Many metrics: Max contig, Total Assembly size, etc.
- HMP assemblies
- How to interpret metrics?
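The contig metrics named on this slide can be computed directly from a list of contig lengths. The following is a minimal sketch, not the pipeline's own code: the function name `assembly_metrics` is hypothetical, and N50 is included only as one common example of the "etc." metrics; the slide itself does not name it.

```python
from typing import Dict, List


def assembly_metrics(contig_lengths: List[int]) -> Dict[str, int]:
    """Summarize an assembly from its contig lengths (in bases)."""
    if not contig_lengths:
        return {"num_contigs": 0, "max_contig": 0, "total_size": 0, "n50": 0}

    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)

    # N50 (assumption: not named on the slide): length of the contig at which
    # the running sum of sorted lengths first reaches half the assembly size.
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "num_contigs": len(lengths),
        "max_contig": lengths[0],
        "total_size": total,
        "n50": n50,
    }


if __name__ == "__main__":
    print(assembly_metrics([100, 200, 300, 400]))
```

A single number such as max contig says little on its own, which is the slide's point; reporting several metrics side by side at least makes assemblies of the same sample comparable.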
6 JGI/LANL Metagenome Assembly Pipeline
7 JGI/LANL Assembly Process
- Improved merging of multiple assemblies
- Statistical metrics are better
- Validation supports these improved assemblies
8 Validation
How do you validate metagenomes?
- Contig Statistics
- Read Mapping
- Annotation
- Similarity Searches
- Phylogenetic distribution
- Reference genomes
9 Read Mapping
- Are contigs correctly assembled?
- Do read data confirm contigs?
- Edge effects
- Generate information about coverage
- Average Fold Coverage
- Percent of Contig Covered
- SNP/INDEL information
- Indication of diversity within sample
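The two coverage figures on this slide follow from per-base read depth along a contig. A minimal sketch, assuming the depths are already available as a list of integers (for example, parsed from a read mapper's per-base depth output); the function name `coverage_summary` is hypothetical.

```python
from typing import List, Tuple


def coverage_summary(depths: List[int]) -> Tuple[float, float]:
    """Return (average fold coverage, percent of contig covered).

    depths[i] is the number of reads overlapping base i of the contig.
    """
    if not depths:
        return 0.0, 0.0

    # Average fold coverage: mean depth over every base of the contig.
    average_fold = sum(depths) / len(depths)

    # Percent covered: share of bases with at least one mapped read.
    covered_bases = sum(1 for d in depths if d > 0)
    percent_covered = 100.0 * covered_bases / len(depths)

    return average_fold, percent_covered


if __name__ == "__main__":
    print(coverage_summary([0, 2, 4, 2]))
```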
10 Finding Potentially Erroneous Contigs with Read Mapping
- Read coverage of each base from pipeline
- Lines delineate regions where mean coverage deviates past thresholds
- Is this a good contig?
- Where did contig come from?
- Can use to automatically break or discard contigs that fail read-mapping
11 Reference Genomes
- Map reads to reference genomes
- Determine which/how many sequenced genomes present in sample
- Allow partitioning of reads
- SNP/INDEL analysis
- Post-Assembly
- Map contigs to genomes
- SNP/INDEL Validation
12 Ongoing Challenges
- Hardware/Software
- Kmer assembly
- Speed/RAM tradeoff
- Algorithms for metagenomes
- Too much software
- How to determine correct assembly?
- Metadata
- Too much data
- Not enough coverage
13 Read Incorporation
[Figure: % of reads incorporated vs. # of reads in sample, by sample type: Biofilm, Compost, Sediment, Soil, Water]
14 Not Enough Coverage
- Coverage varies by sample
- 1 Lane HiSeq is maximum for current hardware/assemblers (complex samples)
[Figure labels: ~6000X coverage; ~70X coverage; ~600X coverage; ~28X coverage; >95% of reads; <10% of reads]
15 Concluding Thoughts
- We need metadata (standards)
- Tiers: 1. Assembly Statistics; 2. Assembly Level; 3. Contig analysis
- Scores
16 Classification of Metagenomes
1. Assembly Statistics
- % Read Incorporation
- Total Assembled Bases (post filtering?)
- G+C Content
- Coverage histograms
2. Assembly Level
- Draft: 1 assembler, 1 set of parameters
- Improved Draft: current JGI/LANL assembly/merge method; read based validation
- High Quality (Theoretical): binning strategies pre-assembly; read based correction/trimming; single cell genomes from site as references
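Two of the tier 1 statistics listed here are simple ratios. A minimal sketch, assuming the mapped and total read counts come from a read-mapping step and the contig sequence is a plain string; both function names are hypothetical, and excluding ambiguous bases such as N from the G+C denominator is an assumption rather than something the slide specifies.

```python
def percent_read_incorporation(mapped_reads: int, total_reads: int) -> float:
    """Percent of input reads that map back to the assembly."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


def gc_content(sequence: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) of a sequence."""
    seq = sequence.upper()
    # Assumption: ambiguous bases (e.g. N) are left out of the denominator.
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / unambiguous


if __name__ == "__main__":
    print(percent_read_incorporation(380, 400))
    print(gc_content("ACGTNN"))
```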
17 Classification of Metagenomes
1. Assembly Statistics
2. Assembly Level
3. Contig analysis
- Read mapping based validation
- Clustering
- Gene analysis
18 Many Thanks
- Metagenomics and Data Analysis Team: Patrick Chain, Tracey Freitas, Ron Croonenberg, Bin Hu, Chien-Chi Lo, Shawn Starkenburg, Gary Xie, Shannon Steinfadt, others
- Metagenome work: Jim Tiedje, Titus Brown, Adina Howe, HMP consortium, Mihai Pop, Joe Zhou, Kostas Konstantinidis
- Informatics Team: Ben Allen, Andy Seirp, Craig Blackhart, Yan Xu, Todd Yilk
- Single cell work: Roger Lasken, Ramunas Stepanauskas, Steve Hallam
- Management Team: Chris Detter, David Bruce, Tracy Erkkila, Lance Green, Shunsheng Han, and many others
- Wet-lab Team: Cheryl Gleasner, Kim McMurry, Krista Reitenga, Xiaohong Shen, others
- Project Management: Shannon Johnson, Lynne Goodwin, others
- Kmer team: Joel Berendzen, Nick Hengartner, Ben McMahon, Judith Cohn
- Finishing and SCG: Olga Chertov, Karen Davenport, Armand Dichosa, Michael Fitzsimons, Ahmet Zeytun, others
19 Metagenomic Assembly Strategies
- Bigger computers: 1 Lane HiSeq PE reads = 400 M reads; 1 TB RAM (complex communities)
- Better Assemblers: MetaIDBA, MetaVelvet, Ray, ABySS, AllPaths
- Binning Reads: Unsupervised/Heuristic, Machine Learning, Statistical, Reference Based
- To Infinity and Beyond
